ABSTRACT


This thesis presents a versatile initialization design
for the dynamic determination of physical resources in an
adaptive manner for a multi-microprocessor environment. The
design is general in nature and represents a structured,
functional approach to the initialization process based on
the use of dynamic resource mapping, knowledge passing
between layered program components, and coordinated
interprocessor communication. An implementation of this
design is presented for initialization of the Secure
Archival Storage System. The hardware architecture utilizes
the commercially available, Z8000-based Advanced Micro
Computer Am96/4116 MonoBoard Computer, configured to support
information security.


I. INTRODUCTION


Initialization is the recurring process of effecting a
normally running operating system on a given hardware
configuration each time the system is started. This thesis
presents a versatile initialization design for general
application, which dynamically determines the system
configuration in an adaptive manner. The initialization
design is implemented specifically for a member of a family
of secure, distributed, multiprogramming, multi-processor
operating systems. The Secure Archival Storage System (SASS)
is that member. A Zilog Z8000 microprocessor-based hardware
architecture was developed that will support ongoing SASS
development, and is used in this design implementation as
the hardware base.

The key features of this initialization design include
its versatility and general applicability, its use of
dynamic resource mapping, its adaptive use of dynamically
defined resources, and its hardware synchronization methods.
The versatility is derived from the use of knowledge passing,
rather than assumption, between components. Dynamic resource
mapping is the defining of hardware resources (in
particular, processors and primary storage) as the
initialization process progresses. The adaptive use of
hardware resources is facilitated by the ability to
dynamically determine where to locate programs and data in
primary storage. The hardware synchronization method makes
use of a shared data structure to facilitate inter-processor
communication and coordination.


A. MOTIVATION

O'Connell and Richardson[1] outlined a high level design
for a family of secure, distributed, multi-processor
operating systems, with the primary motivations of (1)
effectively coordinating the processing power of
microprocessors and (2) providing information security. A
subset of this family, the Secure Archival Storage System
(SASS) [2,4], has been selected as a testbed for the general
design. SASS will provide consolidated file storage for a
network of possibly dissimilar "host" computers. The system
will provide controlled, shared access to multiple levels of
sensitive information [5]. A complete description of this
family of operating systems and the SASS can be found in
reports prepared by Schell and Cox [25,26]. The hardware was
selected and a development system based on the hardware was
procured. A Zilog Z8002 microprocessor-based Developmental
Module [24] integrated with a Z80 microprocessor-based
developmental system has been the instrument of continuing
SASS research. Further efforts by Rietz[5], Coleman[3],
Wells[6], and Strickler[7] have brought the development to
the point where there is a need for a multiprocessor
environment. The Developmental Module does not support
multiprocessor development.

To support concurrent work on the SASS and to act as a
testing platform for the initialization design proposed in
this thesis, the SASS Developmental Architecture was
designed and built. At the foundation of the SASS
Developmental Architecture is the commercially available
Advanced Micro Computer Am96/4116 MonoBoard Computer with a
standard Multibus (INTEL) interface. In this application the
intent was to match the hardware architecture to the needs
of the operating system design, rather than the more common
approach of building an operating system around available
hardware features.


B. BACKGROUND 

Two key concepts have had an underlying
influence on this thesis. These concepts are machine
virtualization and dynamic reconfiguration. It is essential
that these be understood before proceeding into a discussion
of initialization design.

The interfacing of hardware and software is a subject
that has not been adequately covered in the literature. In
fact, the subject of initialization as a whole cannot be
referenced to any significant degree. This interfacing
between the "bare machine" and the software environment is
called the basic machine interface. The basic machine
interface consists of the set of all the software-visible
objects and instructions that are directly supported by the
hardware and firmware of a particular system. A recent work
by Ross[18] has produced a simple, flexible initialization
design that establishes an environment for a basic machine
interface. This is not supported directly on a bare machine
but is instead supported in a manner similar to an extended
machine interface. Extended machine interfaces have long
been used to allow application programs to run on different
machines; they commonly take the form of interpreters. This
basic machine interface is known as a virtual machine and is
a basic feature of the SASS. Ross' initialization design
builds from the "bare machine" to this virtual machine
environment using a layered approach, as described by
Luniewski[17], to establish the interfacing.

The "bare machine" referred to in most initialization
designs is in some manner assumed. Luniewski in his design
assumed a minimal hardware configuration for the
initialization mechanism. Given this minimal hardware
configuration (which he defines to be a subset of the
largest potential hardware configuration), his initialization
mechanism employed dynamic reconfiguration to establish the
actual "bare machine" or hardware configuration. Dynamic
reconfiguration is the changing of the system configuration
while the system is running. From a hardware configuration
viewpoint, this means the dynamic addition or subtraction of
known hardware resources.

Various hardware resources referred to in this thesis
are shown in Figure I-1. A multiprocessor environment refers
to having more than one processor interconnected with a
system bus. Local memory is that primary memory accessed by
only one processor, and global memory is that primary memory
addressable by all processors interfacing to the system bus.
Data storage interfaced to the system bus, with access
controlled by a separate processor, is defined as secondary
storage.


C. OBJECTIVES

The objectives of this thesis are threefold: (1) to show
the need for a method of dynamic determination of hardware
resources, (2) to present an initialization design to
illustrate the methodology, and (3) to implement
the design with the SASS and its hardware architecture to
show it is practical. An additional conscious effort was
made to include within this thesis all documentation
concerning the hardware and software tools necessary to
duplicate the results. Extensive use was made of the
appendices for this purpose.

Reference [27] presents a summary description of the
hardware for the SASS. It contains detailed wiring
configurations for both the MonoBoards and RAM Memory Boards
(Am96/1209), including a detailed description of the wiring
modification for information security of the MonoBoard local
memory. It also contains the SASS Developmental Monitor
program listing with a command syntax tutorial, the Bootload
program listing for the firmware, and the Bootstrap program
listing for use with the SASS. The bootstrap program can be
adapted to other operating systems by changing the manner of
loading from secondary storage. In addition, reference [27]
contains the listings for programs written in support of
this thesis effort. These programs were used to effect the
object code file transfer to the Intellec Microcomputer
Development System (MDS) for programming the EPROM's
(erasable programmable ROM). The same programs also
served for transferring the source code listings contained
in these appendices, for text processing.


D. THESIS STRUCTURE

In this chapter a brief discussion was presented on the
motivations and influences brought to bear on this thesis
effort. The objectives were stated and the documentation
goals were established. Chapter II addresses the subject of
initialization and proposes an initialization design
methodology. Chapter III presents the SASS and its hardware
configuration to allow for a complete environment
definition. This environment served as a model for exploring
issues pertaining to initialization design. Chapter IV
describes the implementation for the actual hardware
architecture. The final chapter presents conclusions and
observations that resulted from this thesis effort and
suggestions for further research.


II. INITIALIZATION DESIGN 


The objective of system initialization is to get an
operating system loaded and running on a computer system.
This task must be accomplished each time a computer system
is powered up or a change to a new or revised operating
system is made. In the past, this has been considered an
implementation detail, specific to a given system. As a
result, existing system initialization schemes are not of a
general nature or structured in design. This chapter will
examine system initialization from the standpoint of a
time-ordered sequencing of activities and functional grouping
of tasks.


A. INITIALIZATION CONCEPTS

The general form of system initialization is to have a
bootload medium, which contains the necessary programs and
data, bring in the operating system from some external
storage device and effect normal operations. The manner in
which this is accomplished can be identified as occurring in
three time-ordered phases: the system generation phase, the
initialization phase, and the run time phase [17]. System
generation time is that phase in which the bootload medium
and operating system core images are generated (created).
This usually occurs in the same environment in which the
operating system is developed. Initialization time is that
period of time from initial power-up of the computer system
to the point where the operating system is running normally.
The time after the system is initialized, when it is running
normally, is called the run time phase.

Luniewski[17] proposed a simple and versatile mechanism
for system initialization based on these three phases and
the underlying premise that an activity performed at system
generation time or at run time is inherently simpler than
the same action performed at initialization time. His method
is based on the idea of a layering of functions, beginning
with an assumed minimal configuration, and the concept of
dynamic reconfiguration [11] to develop the system on which
the core image of the operating system will run. The minimal
configuration, which is a subset of resources contained in
the full hardware configuration, was assumed to have a
single processor, a given primary memory of known size and
known physical address, and system tables of known size.
These assumptions allowed a significant number of
initialization tasks to be performed during the system
generation phase and run time initialization (viz., specific
resource virtualizations).

These initialization concepts were employed in
subsequent work by Ross[18] to design an initialization
mechanism for a particular real-time application. His
approach defined the initialization phase as a "two load"
operation where a ROM resident (in hardware) bootload
program loads a core image of the bootstrap program, which
proceeds to load and develop the environment for the full
operating system core image. This concept of a bootload
program and bootstrap program is driven by the desire to
keep the bootload program (in ROM) small for hardware
considerations.

Each of these initialization programs (i.e., bootload,
bootstrap, operating system core image) is statically
provided with information about the others. The bootload
program knows at what physical address to load the bootstrap
program based on the minimal configuration assumed. The
bootstrap program knows at what physical address to load the
operating system core image based on the restrictions caused
by absolute addressing contained in the operating system
code; in addition the bootstrap program makes fixed physical
resource allocations based on a knowledge of operating
system core image system tables. The operating system knows
the processor(s) data structures (i.e., stacks, PSA) from
information statically provided at system generation time.

A more versatile initialization design would take
a dynamic approach to information passing between
initialization programs. This requires a dynamic
determination of the environment in an adaptive manner;
knowledge of real resources is gained experimentally, with
the new knowledge then used to gain further definition. The
total resource knowledge would then be passed to the
resource managers. In this manner a layered real environment
would build to the base of the layered virtual environment
referred to in the above works.


B. SYSTEM GENERATION PHASE 

The system generation time includes all manner of
software development for system initialization when
considered in the broadest terms. A more restricted and
classical view proposes generation of the operating system
core image and perhaps a portion of the bootstrap program.
In any case the software developed specifically and of
necessity for the operating system of concern is produced
during this phase. As discussed earlier, as many
initialization tasks as possible should be allocated to this
phase and the run time phase. During the system generation
phase the production environment is more hospitable, having
full operating system services available; a similar
environment exists at the beginning of run time. The
initialization phase, on the other hand, has only the bare
hardware and what it can build from this to run on.

The software used during the initialization process is
composed of the bootload program, the (possibly several)
bootstrap programs, and the operating system core image. All
are necessary to effect initialization in a "two load"
environment; however, not all are required to be produced
during the same system generation phase. Using the proposed
adaptive method for dynamic determination of physical
resources should allow some degree of portability for the
programs. For instance, the bootload program can dynamically
determine where to load the bootstrap programs; and the
bootstrap program(s) can in turn do the same with the
operating system core image. This would mean that the
bootload program could service a variety of different
bootstrap programs; and a single bootstrap program could
support a variety of bootstrap loader programs, each
produced specifically for loading a given operating system.
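
One way such a dynamic placement decision might look is
sketched below in Python. It is an illustration under stated
assumptions, not the thesis algorithm: the function name
find_load_address, the first-fit policy, and the example block
addresses are all hypothetical; only the idea of choosing a
load address from a dynamically built memory map comes from
the text.

```python
BLOCK_SIZE = 0x0800  # 2K-byte blocks (block size used in the implementation)


def find_load_address(existing_blocks, image_size, block_size=BLOCK_SIZE):
    """Return the lowest address where `image_size` bytes fit in
    contiguous existing blocks, or None if no such run exists.

    First-fit is an assumption of this sketch, not the thesis policy.
    """
    needed = -(-image_size // block_size)  # ceiling division: blocks required
    run_start, run_len, previous = None, 0, None
    for adr in sorted(existing_blocks):
        if previous is not None and adr == previous + block_size:
            run_len += 1                   # block extends the current run
        else:
            run_start, run_len = adr, 1    # gap found: start a new run
        if run_len >= needed:
            return run_start
        previous = adr
    return None


# Hypothetical memory map: one isolated block and a three-block run.
memory_map = [0x4000, 0x8000, 0x8800, 0x9000]
two_block_image = find_load_address(memory_map, 0x1000)
too_large_image = find_load_address(memory_map, 0x2000)
```

With this map, a two-block image lands at the start of the
three-block run, and a four-block image finds no home. Because
the loader receives the map as data, the same code would serve
any configuration the mapping step discovers.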

One restriction on the program development must exist
during the system generation phase to support the proposed
method. In the absence of a linking loader in the
initialization code, which has been dismissed adequately by
Luniewski[17] as impractical, program development must
proceed in an environment free of absolute addressing. This
will ensure that, regardless of where the programs' core
images are positioned in physical memory, code execution can
be started and will continue to completion. In the case of
the ROM resident bootload program, its application will not
be limited to a specific hardware architecture; and the
bootstrap programs and operating system core image will have
the same degree of configuration independence.


C. INITIALIZATION PHASE

The initialization phase begins at the moment a hardware
"bootload" signal is applied to a computer system and ends
with an operating system running normally. As stated above,
the participants of this phase are a bootload program, one
or more bootstrap programs, and the operating system core
image. These elements can serve as natural partitions for a
time sequencing of the bootload phase. Execution starts in
the bootload program, control is passed to the next
sequential bootstrap program, and so forth. Once the control
flow has left a given partition, it does not return. The
only exception might be the case of a return to a monitor
based program for a developmental system, which most likely
will be ROM resident. This monitor program, however, must be
considered a separate program from the bootload program.

Within each of the time sequencing partitions the tasks
performed can be grouped into stages for ease of definition.
These stages are not necessarily disjoint in time; some
parts of the tasks may be performed concurrently. This
initialization phase organization is shown in Figure II-1.
Bootload operations occur in three stages: (1) independent
processor stage, (2) cooperating processor stage, and (3)
local initialization stage. The bootstrap program operations
are divided into the global initialization stage and the
core image load stage.

1. Bootload Operations


Those initialization tasks performed in the bootload
program occur in three stages: the independent processor
stage, which is characterized by a single processor
awareness and a variable-free environment; the cooperating
processor stage, which begins with the first use of a
multiprocessor mutual exclusion mechanism to facilitate
multiprocessor shared memory in a coordinated fashion; and
the local initialization stage, in which the local software
and hardware initializations occur. The bootload program, as
stated earlier, is ROM (or EPROM) resident; this realization
of the bootload program in hardware is commonly called the
firmware and will be termed thus throughout the remainder of
this thesis.

a. Independent Processor Stage 

The beginning of the initialization process is
defined by an initial execution point and an initial address
space (in keeping with the definition of a process). The
processor itself has only limited internal resources and no
knowledge of any other processors. Its internal resources
consist solely of its register structure. The initial
execution point is obtained by the processor from an address
location defined internally within the processor. Commonly,
on power-up or reset all internal registers are cleared
(zeroed) and the instruction counter is then used to "fetch"
the initial execution point. In the case of the Z8002, for
example, physical address zero would contain the initial
execution point: in our implementation one address (0002
HEX) would contain the processor status (FCW) and another
(0004 HEX), the program counter (PC) value. The initial
address space as known by the processor is the firmware.
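
As an illustrative sketch (not the thesis firmware), the
following Python fragment shows how the two reset words could
be read out of a firmware image. The function name
reset_vector and the sample image bytes are hypothetical; the
offsets 0002 and 0004 come from the text, and big-endian word
order is the Z8000 convention.

```python
import struct

FCW_OFFSET = 0x0002  # processor status (FCW) word, per the text
PC_OFFSET = 0x0004   # initial program counter word, per the text


def reset_vector(image):
    """Return (fcw, pc) read from a firmware image as big-endian words."""
    if len(image) < PC_OFFSET + 2:
        raise ValueError("image too small to hold the reset words")
    (fcw,) = struct.unpack_from(">H", image, FCW_OFFSET)
    (pc,) = struct.unpack_from(">H", image, PC_OFFSET)
    return fcw, pc


# Hypothetical 8-byte image: FCW = 4000 HEX, initial PC = 0008 HEX.
sample_image = bytes([0x00, 0x00, 0x40, 0x00, 0x00, 0x08, 0x00, 0x00])
fcw, pc = reset_vector(sample_image)
```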

The goal of the independent processor stage is 
to dynamically determine the existence of physical memory 
resources; knowledge as to the size and addresses must be 
obtained. Without this knowledge, the processor is working 
in a variable-free environment, using only its registers for
data storage. 

This dynamic function of memory determination is
accomplished in three tasks: clearing of memory, defining
memory, and mapping of memory. Figure II-2 presents
pseudo-coding for each of these functions. The clearing of
memory requires that portions of memory being mapped be
brought to a known state, which removes the probabilistic
aspects of the task. Conventional notions of a memory map
divide the memory into blocks. These blocks will be sampled
during the mapping to dynamically determine if they exist.
The location for sampling within each block must be brought
to this known state. Defining memory then becomes a task of
checking for read/write capability of each memory block.
Once the existing locations are found, it is possible to
build the entire memory map during the memory mapping task.


CLEAR MEMORY:
    BLOCK_ADR := START_BLOCK
    SCRIBE_ADR := BLOCK_ADR + 1
    DO
        @BLOCK_ADR := R/W_PATTERN
        @SCRIBE_ADR := 0
        BLOCK_ADR := BLOCK_ADR + BLOCK_SIZE
        SCRIBE_ADR := BLOCK_ADR + 1
    UNTIL (BLOCK_ADR > END_BLOCK)
    WAIT


SCRIBE MEMORY:
    BLOCK_ADR := START_ADR
    SCRIBE_ADR := BLOCK_ADR + 1
    DO
        IF (@BLOCK_ADR = R/W_PATTERN) THEN
            LOCK SYSTEM BUS
            @SCRIBE_ADR := @SCRIBE_ADR + 1
            UNLOCK SYSTEM BUS
        BLOCK_ADR := BLOCK_ADR + BLOCK_SIZE
        SCRIBE_ADR := BLOCK_ADR + 1
    UNTIL (BLOCK_ADR > END_BLOCK)
    WAIT


DEFINE MEMORY:
    LOW_GLOBAL_BLOCK := NON-EXISTENT
    LOW_LOCAL_BLOCK := NON-EXISTENT
    CPU_COUNT := 1
    BLOCK_ADR := END_BLOCK
    SCRIBE_ADR := BLOCK_ADR + 1
    DO
        IF (@BLOCK_ADR = R/W_PATTERN) THEN
            IF (@SCRIBE_ADR = 1) THEN
                LOW_LOCAL_BLOCK := BLOCK_ADR
            ELSE IF (@SCRIBE_ADR >= CPU_COUNT) THEN
                CPU_COUNT := @SCRIBE_ADR
                LOW_GLOBAL_BLOCK := BLOCK_ADR
            BLOCK_ADR := BLOCK_ADR - BLOCK_SIZE
            SCRIBE_ADR := BLOCK_ADR + 1
        ELSE
            BLOCK_ADR := BLOCK_ADR - BLOCK_SIZE
            SCRIBE_ADR := BLOCK_ADR + 1
    UNTIL (BLOCK_ADR < START_ADR)
    RETURN (LOW_LOCAL_BLOCK, LOW_GLOBAL_BLOCK,
            CPU_COUNT)


Independent Processor Stage


Figure II-2
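
The three routines of Figure II-2 can be exercised in a small
Python simulation. This is a sketch under stated assumptions,
not the firmware itself: the Block class, the per-processor
"view" dictionaries, and the example addresses are
hypothetical, and the WAIT steps are modelled simply by
running each phase for all processors before starting the
next phase.

```python
BLOCK_SIZE = 0x0800                 # 2K-byte blocks, as in the implementation
RW_PATTERN = 0x55AA                 # read/write test word
START_ADR, END_BLOCK = 0x0000, 0xF800


class Block:
    """One block of RAM; only the two sampled locations are modelled."""
    def __init__(self):
        self.pattern = None         # sample location (@BLOCK_ADR)
        self.scribe = None          # scribe location (@SCRIBE_ADR)


def clear_memory(view):
    for adr in range(START_ADR, END_BLOCK + 1, BLOCK_SIZE):
        block = view.get(adr)
        if block is not None:       # writes to absent memory are simply lost
            block.pattern = RW_PATTERN
            block.scribe = 0


def scribe_memory(view):
    for adr in range(START_ADR, END_BLOCK + 1, BLOCK_SIZE):
        block = view.get(adr)
        if block is not None and block.pattern == RW_PATTERN:
            block.scribe += 1       # done under a bus lock on real hardware


def define_memory(view):
    low_local = low_global = None   # None stands for NON-EXISTENT
    cpu_count = 1
    for adr in range(END_BLOCK, START_ADR - 1, -BLOCK_SIZE):
        block = view.get(adr)
        if block is None or block.pattern != RW_PATTERN:
            continue
        if block.scribe == 1:
            low_local = adr
        elif block.scribe >= cpu_count:
            cpu_count = block.scribe
            low_global = adr
    return low_local, low_global, cpu_count


# Two processors: a private block at 4000 HEX each, two shared blocks.
shared = {0x8000: Block(), 0x8800: Block()}
views = [{**shared, 0x4000: Block()} for _ in range(2)]

for view in views:
    clear_memory(view)              # every CPU clears before any CPU scribes
for view in views:
    scribe_memory(view)
results = [define_memory(view) for view in views]
```

Each simulated processor reports its lowest local block at
4000 HEX, the lowest global block at 8000 HEX, and a count of
two processors. Running the clear phase to completion first
matters: a late clear would erase another processor's scribe
mark.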
A sample location is brought to the known state
by writing to it a specific pattern. This pattern is used to
check the operational status of the hardware memory devices
(RAM). Each device must be able to contain data in both
states, i.e., 1 and 0; therefore a write-then-read operation
should be performed with 1's and 0's. In a byte oriented
memory organization, for example, RAM may consist of eight
devices, each contributing one bit for each byte of data. In
such a case, the pattern would be written and read from one
byte (e.g., "55") and the inverse pattern (e.g., "AA")
written and read to another. Using this example in a 16-bit
architecture would require a one word ("55AA") read/write
operation to verify the operational status of a block of the
byte oriented memory organization.
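
The reasoning behind the "55AA" word can be checked with a
short Python model. It is a sketch only: the ByteWideRam
class and its stuck_at fault description are hypothetical
devices for illustration. Each of the eight one-bit devices
serves the same bit position in every byte, so writing "55"
to one byte and "AA" to the next forces every device to hold
both a 1 and a 0.

```python
class ByteWideRam:
    """Byte-organised RAM built from eight one-bit devices.

    `stuck_at` maps a device (bit position 0-7) to the level it always
    returns, modelling a failed device. Hypothetical, for illustration.
    """
    def __init__(self, stuck_at=None):
        self.cells = {}
        self.stuck_at = stuck_at or {}

    def write_byte(self, adr, value):
        self.cells[adr] = value & 0xFF

    def read_byte(self, adr):
        value = self.cells.get(adr, 0)
        for bit, level in self.stuck_at.items():
            mask = 1 << bit
            value = (value | mask) if level else (value & ~mask)
        return value


def word_test(ram, adr, pattern=0x55AA):
    """Write one 16-bit word (high byte first), read it back, compare."""
    hi, lo = pattern >> 8, pattern & 0xFF
    ram.write_byte(adr, hi)
    ram.write_byte(adr + 1, lo)
    return (ram.read_byte(adr), ram.read_byte(adr + 1)) == (hi, lo)


healthy = word_test(ByteWideRam(), 0x0000)
faults_detected = all(
    not word_test(ByteWideRam({bit: level}), 0x0000)
    for bit in range(8)
    for level in (0, 1)
)
```

In this model a healthy memory passes, and every single
stuck-at-0 or stuck-at-1 device is caught by the one-word
test.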

The size of the blocks that divide the physical
memory space is determined from consideration of the
hardware organization of memory, the sizes of any natural
partitions (i.e., ROM), and in certain cases the aspects of
memory virtualization to be used later. In our
implementation, for example, the minimum size of the blocks
was determined by the minimum ROM size that the processor
architecture could support (2K bytes). The entire physical
address space is then divided into blocks of this size, and
an appropriate location within the block (usually the first
address) designated for sampling. The individual physical
memory devices are checked by use of the read/write pattern
in the manner described above. Once a device is found
operational and accessible for one address, it can be
assumed the same for all addresses within the block since
each address also enables the device. The entire physical
address space, taken in blocks, is then cleared, defined and
mapped.
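
The block arithmetic is simple enough to state directly. The
Python fragment below is a minimal sketch, assuming a single
64K-byte address space (the non-segmented Z8002 uses 16-bit
addresses); the helper names are hypothetical.

```python
BLOCK_SIZE = 2 * 1024        # 2K bytes, the minimum ROM size in the text
ADDRESS_SPACE = 64 * 1024    # assumption: one 16-bit, 64K-byte space


def sample_addresses(space=ADDRESS_SPACE, block=BLOCK_SIZE):
    """First address of every block: the location sampled during mapping."""
    return list(range(0, space, block))


def block_index(adr, block=BLOCK_SIZE):
    """Index of the block that contains address `adr`."""
    return adr // block


samples = sample_addresses()
```

Under these assumptions there are 32 blocks, sampled at 0000,
0800, ..., F800 HEX, and a result for one sampled address is
taken to hold for the whole block that contains it.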

Included with the code contained in firmware is
an identifier to be used by the processor for its own unique
identification within the system. This value can be put into
the firmware by serializing each ROM as it is programmed
with the bootload program. At the end of the independent
processor stage the processor has knowledge of its own
memory space (RAM) in the form of a memory map; its own
unique processor identification obtained from serialized
firmware; and the location of its firmware within the
address space.

b. Cooperating Processor Stage

The cooperating processor stage begins with the
assumed existence of other processors, which requires the
use of a mutual exclusion mechanism to preserve the
integrity of a shared memory. Some read-alter-write
operation (e.g., a test and set instruction) is needed. A
typical read-alter-write operation involves fetching a value
from memory, modifying the value, and writing it back.

The objectives of this stage are to define the
multiprocessor environment and to select a "bootload
processor" to coordinate the loading of, and transferring of
control between, bootstrap and operating system core image
programs. A single processor for interfacing with secondary
storage is necessary to prevent the interference that would
result from multiple commands to secondary storage. The
addresses to be loaded from secondary storage must be
determined by only one processor. Definition of the
multiprocessor environment requires determining the total
number of processors, their corresponding unique
identifications, and the existence of any shared memory.
Each of these is facilitated by a method of having each
processor "make itself known" to the multiprocessor system;
this is done by scribing memory.

Scribing is the operation in which each
processor uses the read-alter-write operation, with mutual
exclusion, to increment by one (make itself known) the value
of a pre-defined location within each accessible block of
memory. For the Am96/4116 MBC, mutual exclusion is provided
by "locking" the system bus to prevent other processors from
interfering with the operation. After all processors have
completed scribing their own address space, the highest
value found at this location from all memory blocks is the
total number of system processors which are defined to have
global memory. Global memory is composed of those memory
blocks having the maximum number of processors scribed into
them. Those memory blocks having fewer than the maximum but
more than one processor scribed into them are not considered
local or global (a block scribed by exactly one processor is
local memory). They cannot function as local memory
because of the possibility of information overwrites, or as
global memory because they are not accessible by all.
Use of this memory becomes an implementation detail. The
consolidation of all processors' memory maps into a system
memory map and the assignment of logical CPU numbers (system
use) to physical processors (unique ID) would complete the
multiprocessor environment definition.
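
The classification rule just described can be written out as
a short Python sketch. The function name classify_blocks, the
label strings, and the example scribe counts are hypothetical;
the rule itself (a count of one is local, the maximum count is
global, anything between is neither) follows the text.

```python
def classify_blocks(scribe_counts):
    """Classify blocks from their scribe counts.

    `scribe_counts` maps block address -> number of processors that
    scribed the block. Returns (total_processors, {address: kind}).
    """
    cpu_total = max(scribe_counts.values())
    kinds = {}
    for adr, count in scribe_counts.items():
        if count == 1:
            kinds[adr] = "local"      # seen by exactly one processor
        elif count == cpu_total:
            kinds[adr] = "global"     # seen by every processor
        else:
            kinds[adr] = "neither"    # shared by some, but not all
    return cpu_total, kinds


# Hypothetical three-processor system.
counts = {0x4000: 1, 0x6000: 2, 0x8000: 3, 0x8800: 3}
cpu_total, kinds = classify_blocks(counts)
```

Testing for a count of one first mirrors the order used in
DEFINE MEMORY, so a single-processor system reports all of its
memory as local.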

To coordinate the multiprocessor efforts in
defining this system environment, a bootload processor,
called the bootload CPU, must be established. The bootload
CPU coordinates the activities of the other processors by
use of a system data structure known as the Configuration
Table. This structure, as seen in Figure II-3, is initially
used to determine the bootload CPU. The table is implicitly
established at an available global memory address, and a
deliberate race condition is effected to select the bootload
CPU. Each processor attempts to gain exclusive access to the
configuration table by successfully setting the table lock;
a test and set operation must be performed to set this lock.
The use of the table lock ensures sequential access to the
table.


Configuration Table
Figure II-3


To ensure the integrity of this test and set
operation in a multiprocessor environment, a mechanism to
lock the system bus must be employed. The locking of the bus
prevents other processors from using it. An intentional race
condition is established whereby each processor attempts to
lock the system bus so as to gain access to the
configuration table. This race condition can then be used to
differentiate the processors and thereby select the bootload
CPU. For instance, in the algorithm depicted in Figure II-4,
each processor obtains its own logical CPU number on entry
into the table and determines if it is the bootload CPU. The
bootload CPU has logical CPU number 94; all others become
member CPU's.
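
The deliberate race can be imitated with threads standing in
for processors. The Python sketch below is illustrative only:
the ConfigTable class, its field names, and the unique IDs are
hypothetical, a threading.Lock plays the part of the bus lock
that makes test and set indivisible, and numbering the first
entrant 0 is an assumption of this sketch.

```python
import threading
import time


class ConfigTable:
    """Hypothetical stand-in for the shared configuration table."""
    def __init__(self):
        self._bus = threading.Lock()  # emulates locking the system bus
        self.table_lock = 0           # 0 = free, 1 = held
        self.next_logical = 0         # logical CPU number for next entrant
        self.cpu_list = {}            # logical CPU number -> unique ID

    def test_and_set(self):
        """Indivisibly return the old lock value and set the lock."""
        with self._bus:
            old = self.table_lock
            self.table_lock = 1
            return old

    def unlock(self):
        self.table_lock = 0


def enter_table(table, unique_id):
    while table.test_and_set():       # spin until the lock was found free
        time.sleep(0)                 # yield; firmware would simply loop
    logical = table.next_logical      # winner of the race sees 0
    table.next_logical += 1
    table.cpu_list[logical] = unique_id
    table.unlock()
    return logical


def boot(table, unique_id, roles):
    logical = enter_table(table, unique_id)
    roles[unique_id] = "bootload" if logical == 0 else "member"


table, roles = ConfigTable(), {}
cpus = [threading.Thread(target=boot, args=(table, uid, roles))
        for uid in (0x11, 0x22, 0x33, 0x44)]
for cpu in cpus:
    cpu.start()
for cpu in cpus:
    cpu.join()
```

Which processor wins varies from run to run, but exactly one
becomes the bootload CPU and every processor receives a
distinct logical CPU number.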

The cooperating processor stage algorithm as
shown in Figure II-4 presents the sequence of events for
both the bootload CPU and the member CPU's. The bootload and
bootstrap programs executed by the processors contain the
algorithms for both the bootload and member CPU's. The
bootload CPU begins by incrementing the logical CPU number
for the use of the next processor to access the table. It
then clears the configuration table to accommodate an entry
by each processor, enters its own unique ID, maps and enters
its own memory map, and then unlocks the configuration table
for access by the member CPU's. The memory map is a data
structure representation of the local/global determination
of each memory block. The bootload CPU then waits (observing
the logical CPU numbers) for all member CPU's to complete
their entries before proceeding.

Member CPU's use their logical CPU numbers to
index the configuration table CPU list to determine where to
place their entries. After entering its own unique ID and
memory map, each member CPU increments the logical CPU
number indicator, unlocks the configuration table, and waits
for an appropriate signal from the bootload CPU to proceed.

The "signal" and "wait" operations performed
here are of the type employed by operating system traffic
controllers for synchronization between processors, though
of a more primitive form. A processor "waits" by looping,
looking for the occurrence of an event; a processor
"signals" by posting or otherwise establishing the
occurrence of an event. The algorithms of Figure II-4 make
use of the signal blocks of the configuration table to
deterministically signal the passing of information in the
message blocks and the granting by the bootload CPU of
permission to proceed.
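
A primitive signal/wait pair of this kind is sketched below in
Python, with a thread standing in for a member CPU. The
SignalBlock class, its field names, and the example entry
address are hypothetical; the point illustrated is only that
the message is written before the event is posted, and that
the waiter loops until it sees the event.

```python
import threading
import time


class SignalBlock:
    """Hypothetical signal block paired with its message block."""
    def __init__(self):
        self.posted = False   # the event flag examined by the waiter
        self.message = None   # information passed with the signal


def signal(block, message):
    block.message = message   # write the message first...
    block.posted = True       # ...then post the event


def wait(block):
    while not block.posted:   # loop, looking for the event
        time.sleep(0)         # yield; firmware would simply keep looping
    block.posted = False      # consume the event
    return block.message


received = []
block = SignalBlock()
member = threading.Thread(target=lambda: received.append(wait(block)))
member.start()
signal(block, 0x8000)         # bootload CPU posts a hypothetical entry address
member.join()
```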

A key issue at this point is the way in which
the bootstrap and operating system core images are loaded.
In general, each image is loaded into global memory by the
bootload CPU; and each member CPU must copy its, possibly
different, core image into its own local memory. This method
of downloading the core images allows for the possibility
of different types of processors in the architecture, and
compensates for the fact that the bootload CPU cannot
address the local memory of any other processors (i.e., not
using dual-ported memory). In the more common homogeneous
processor architecture, a shared bootstrap program can be
utilized in global memory.

The bootstrap program load is performed by the
Bootload CPU interacting with a known (at system generation)
secondary storage device. The global memory load address is
determined dynamically by the Bootload CPU with use of the
configuration table. The entry address of the bootstrap core
image is loaded into the message block of each processor,
and all are signalled to proceed. With this method of
individual message blocks, each processor may download a
different bootstrap program. Transfer of control is effected
when each processor obtains the entry point from its message
block and executes a call to that location. Each processor
passes its own logical CPU number as a pass-by-value
parameter in this call, and the location of the
configuration table is passed as a pass-by-reference
parameter. After the transfer of program control out of the
firmware, the bootload program is no longer used. The
firmware can, if desired, be disabled or disconnected for
re-use of the physical address space it occupied.
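
The transfer of control can be illustrated with the following sketch, in which a Python callable stands in for the entry address held in each message block. The names bootstrap_program and leave_firmware and the dictionary layout of the table are assumptions made for illustration only. An integer argument behaves as a pass-by-value parameter, while the table object is shared by reference.

```python
def bootstrap_program(logical_cpu, config_table):
    """Hypothetical bootstrap entry point reached from the firmware."""
    # logical_cpu is an immutable int: rebinding it here cannot affect
    # the caller. config_table is the shared object itself: updates
    # made here are visible to every processor.
    config_table["in_bootstrap"].add(logical_cpu)
    return logical_cpu

def leave_firmware(logical_cpu, config_table):
    """Fetch the entry point from this CPU's message block and call it."""
    entry_point = config_table["message_blocks"][logical_cpu]
    return entry_point(logical_cpu, config_table)

def demo():
    table = {"message_blocks": {}, "in_bootstrap": set()}
    for cpu in range(3):                 # bootload CPU posts entry points
        table["message_blocks"][cpu] = bootstrap_program
    for cpu in range(3):                 # each CPU transfers control
        leave_firmware(cpu, table)
    return table
```

Since every processor reads its own message block, posting a different callable for one processor is all that would be needed to give it a different bootstrap program.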

As can be seen in Figure II-4, the bootload CPU
then proceeds to determine the address at which to load the
operating system core image, while the member CPUs wait for
the appropriate signal to continue. After the core image is
loaded, the bootload CPU signals all processors to leave the
bootstrap program and enter the base layer of the operating
system. Each processor's logical CPU number and the address
of the configuration table are again passed into the new
program. Once the flow of program control has left the
bootstrap program, the memory space that it occupied can be
re-used.

In summary, it should be pointed out that
nothing was known of the bootstrap program (except its
secondary storage address) by the bootload program, or of
the operating system core image by the bootstrap program;
and the only common link is the knowledge contained in the
configuration table, which was dynamically determined and
passed as a parameter to each successive program.

c. Local Initialization Stage

The local initialization stage is contained 
within the firmware. The tasks performed by this stage are 
hardware dependent and include both hardware and software 
initialization functions specific to each single processor. 

Hardware initialization functions include 
initializing specific purpose internal CPU registers and 
external devices. Specific purpose registers include 
pointers and counters used for specific functions on which 
the hardware depends for normal operation (i.e., PSAP, stack
pointer, and REFRESH registers). External devices include
the various input/output interface devices, processor
clock/counter devices, auxiliary processing components, and
interrupt controllers. Each device state or function is 
determined under software control; the CPU must individually 
program each device. 

Software initialization involves setting the
processor's data structures and variables. Examples of these
types of data structures include processor interrupt/trap
jump vectors and stacks. All tasks performed in this stage
are highly implementation and hardware oriented. The local
initialization stage in this initialization design contains 
the program information that must be changed for 
applications to different hardware architectures. 

2. Bootstrap Operations

Bootstrap operations begin after transfer of program
control flow from the firmware to the bootstrap core image.
Two groupings of tasks are found in the bootstrap program,
the global initialization stage and the core image load
stage. The global initialization stage performs the hardware
resource knowledge consolidation for the system, and the
creation of operating system data structures. During the
core image load stage the operating system is loaded and a
signaled transfer of control occurs which marks the
completion of the initialization phase.

a. Global Initialization Stage

The configuration table contains resource
information about each individual processor. This
information must be interpreted from an integrated system
viewpoint in order to establish the system resources. The
Bootload CPU consolidates entries in the configuration table
to produce the information useful to the base layer of the
operating system. The logical-to-physical CPU mapping of
the processors is already available, but might be
re-arranged into a more convenient format.

Individual memory maps are consolidated into one 
system map primarily for global memory determination. 
Consolidation produces a system memory map showing total 
size and addresses of global memory. The local memory map of 
each processor has no meaning on the system level except to 
collectively scope the address space range within the 
system. 
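
One way to picture this consolidation is sketched below. The sketch assumes, purely for illustration, that each entry in a processor's memory map is already tagged as "local" or "global" and carries a half-open address range; neither this tagging nor the names merge_ranges and consolidate are taken from the design itself.

```python
def merge_ranges(ranges):
    """Merge overlapping or adjacent half-open (start, end) ranges."""
    merged = []
    for lo, hi in sorted(ranges):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

def consolidate(cpu_maps):
    """Build a system memory map from per-CPU maps.

    Each per-CPU map is a list of (kind, start, end) tuples, where kind
    is "local" or "global" (an assumed representation).
    """
    global_ranges = merge_ranges(
        (lo, hi) for m in cpu_maps for kind, lo, hi in m if kind == "global")
    bounds = [(lo, hi) for m in cpu_maps for _, lo, hi in m]
    return {
        "global": global_ranges,
        "global_size": sum(hi - lo for lo, hi in global_ranges),
        # Local maps contribute only to the overall address span.
        "address_span": (min(lo for lo, _ in bounds),
                         max(hi for _, hi in bounds)),
    }
```

For example, two processors that each report the same global range plus local memory of different sizes yield a system map with that one global range, its total size, and an address span covering both local memories.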

During this stage, the bootload CPU dynamically
determines the address location in which to load the core
image of the operating system. A pre-allocation entry for
the core image is made within the system global memory map.

b. Core Image Load Stage 

Loading of the operating system base layer is in
general a two-load operation. The Bootload CPU determines a
suitable global memory location from the system memory map
and loads a portion of the base layer core image. This part
of the operating system includes all processor local code
(distributed) and data structures (if pre-established at
system generation time). Each member CPU is signaled by the
Bootload CPU to "down-load" this core image to an address in
local memory determined individually by each processor. The
starting address of the core image in global memory and its
size are passed in the message blocks of the configuration
table. Again, individual message blocks allow each processor
to possibly have a different core image to download. As each
processor completes the task and updates its individual
memory map, it increments the CPU count in the configuration
table to indicate this. The Bootload CPU is then aware of
total task completion; each member CPU again waits for a
signal to continue.
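
The first load and its completion count can be sketched as follows. Threads stand in for member CPUs, a bytearray stands in for global memory, and the dictionary keys ("signal", "message", "cpu_count", "lock") are hypothetical names. The update of each processor's individual memory map is omitted for brevity.

```python
import threading
import time

def member_download(table, cpu, global_memory, local_memory):
    """Member CPU: wait for the signal, copy the image, report completion."""
    while not table["signal"][cpu]:              # primitive wait
        time.sleep(0.001)
    start, size = table["message"][cpu]          # posted by the bootload CPU
    local_memory[cpu] = bytes(global_memory[start:start + size])
    with table["lock"]:
        table["cpu_count"] += 1                  # indicate task completion

def bootload_distribute(table, global_memory, image, start, members):
    """Bootload CPU: load the image, signal members, await completions."""
    global_memory[start:start + len(image)] = image
    for cpu in members:
        table["message"][cpu] = (start, len(image))
        table["signal"][cpu] = True              # primitive signal
    while table["cpu_count"] < len(members):     # wait for total completion
        time.sleep(0.001)

def demo(members=(1, 2, 3)):
    table = {"lock": threading.Lock(), "cpu_count": 0,
             "signal": {c: False for c in members}, "message": {}}
    global_memory = bytearray(64)
    local_memory = {}
    threads = [threading.Thread(target=member_download,
                                args=(table, c, global_memory, local_memory))
               for c in members]
    for t in threads:
        t.start()
    bootload_distribute(table, global_memory, b"BASE-LAYER", 16, members)
    for t in threads:
        t.join()
    return table, local_memory
```

Once the count equals the number of member CPUs, the bootload CPU in this sketch knows that the global copy of the image is no longer needed.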

The second part of the operating system base
layer to be loaded by the Bootload CPU is a global resident
core image. The previous core image, having been down-loaded
by each processor, can now be over-written in global memory.
The Bootload CPU loads the global portion of the base layer
core image into an appropriate location in global memory,
allocates the locations in the system global memory map, and
signals this fact to the member CPUs along with the global
location. All processors transfer control to a single base
layer entry point, either in local or global memory. This
marks the end of the initialization phase and the beginning
of the run time phase.

D. RUN TIME PHASE

The run time phase begins with execution
of the base layer of the operating system. The run time
phase contains an additional initialization stage which
primarily serves to format the system configuration
knowledge according to operating system specifications, and
to virtualize the physical resources prior to normal system
operation.

1. Run Time Initialization

Unlike the bootload and bootstrap programs that
executed on the bare system hardware, the initialization
routines of the operating system base layer can be supported
by the operating system functions. But before these support
functions are available, the system wide data structures for
resource management must be created. The bootload CPU
dynamically determines the location of these system tables
in global memory from the system memory map, and proceeds to
build the tables based on information contained in the core
image about the structures.

Once the operating system resource management tools
have been constructed, the bootload CPU signals the member
CPUs to begin execution of the base layer of the operating
system. The primitive processor synchronization mechanism
used thus far is again used. After each processor begins
execution of the base layer of the operating system, the
more sophisticated synchronization mechanisms of the
operating system are available, and the bootload CPU
distinction is no longer required.

All functions performed during this stage are
executed only once; therefore care should be taken to keep
the program sections small or to reuse the memory space
after execution.

2. Run Time Load 

Any additional code or data required from secondary
storage to effect the layering of the operating system can
be obtained by use of a loader process within the running
operating system, for example, the SASS supervisor for file
system initialization. Normal process synchronization
services are available to the loader process. A further
discussion and example of a loader process has been
presented by Anderson [19]. It is interesting to note that
the loader process can contain many of the fault tolerant
aspects of the system. Once the operating system is running
normally, the initialization process is completed.


E. SUMMARY

An initialization design for dynamic determination of
the resources in an adaptive manner was presented in this
chapter. The three phases which sequence the initialization
process in time were discussed; the program mediums used to
perform the initialization were broken down into functional
stages; and a dynamic parameter passing technique between
mediums was presented.

The significant features include the general
applicability of the design, the dynamic resource mapping
routines, and the processor synchronization method. The
general applicability of the design is based on independence
of program units, viz., the bootload program, bootstrap
program(s), and operating system core image. Dynamic
resource mapping is performed on system processors and
memory. The processor synchronization method makes use of a
global data structure known as the configuration table to
facilitate inter-processor communication, and a randomly
determined controlling processor for synchronizing
processors and interfacing secondary storage.

Chapter IV will apply this initialization design to the
SASS Developmental Architecture to implement initialization
for this operating system. The next chapter will define the
environment in which this initialization design will be
implemented. The Secure Archival Storage System will be
examined to determine what minimal configuration is required
by the base layer; and the hardware architecture built to
support the SASS will be studied as part of this minimal
configuration.

III. ENVIRONMENT DEFINITION


A. OBJECTIVES

The first consideration in system initialization is the
environment definition. As previously stated, system
initialization produces a loaded and running operating
system. The operating system runs on what it knows as the
bare machine or the system configuration. System
configuration consists of the software configuration and
hardware configuration [17]. The software configuration
consists of the values of various system parameters and the
sizes of the required data structures or system tables, i.e.,
the amount of available memory or a bit map of secondary
storage. The hardware configuration is defined by the
collection of hardware modules comprising the system, and
the manner in which they are connected. System
initialization is a step-wise evolution from a fixed
(e.g., PROM-resident) bootload program running on the
minimum basic hardware to the actual running operating
system. A definition of the environment is necessary to
effect initialization.

The objective of this chapter is to illustrate this
definition process by actually studying a state-of-the-art
operating system, the Secure Archival Storage System (SASS),
and the assembled developmental architecture on which it
will run. The operating system (SASS) will be discussed in a
top-down manner from its desired functionality to its
required system configuration. For a more complete analysis
the reader is encouraged to consult the referenced
literature. The coverage presented in this thesis is meant
to familiarize readers with the design structure. The
following background is necessary to complete the coverage
for all readers. Those already familiar with basic operating
system concepts may want to skip to section C. The hardware
architecture is covered to the same depth of detail in
section D; however, supplementary background information and
architectural details are included in reference [27].


B. SECURE OPERATING SYSTEM CONCEPTS

The operating system to be considered is the Secure
Archival Storage System, which is a subset of a family of
secure, multi-microcomputer operating systems described by
O'Connell and Richardson [1]. Two primary motivations in the
design of this family of secure operating systems were (1)
to effectively coordinate the processing power of multiple
microprocessors, and (2) to provide information security.
Before presenting an overview of SASS, a few fundamental
operating system and information security concepts that
directly relate to these design motivations should be
reviewed.

1. Multiprogramming and Multiprocessing


Fundamental to the concept of multiprogramming is
the notion of a process. A process can be described [8] as a
set of related procedures and data undergoing execution and
manipulation, respectively, by one of possibly several
processors of a computer. It may be considered as a logical
entity characterized by an address space and an execution
point. A process' address space consists of the set of all
memory locations accessible by the process during its
execution. Its execution point is the internal state of the
processor on which the process is running, at the instant
of execution. Both the processor internal state and the
running process' address space can be preserved and restored
at a later time. This ability to store or re-instate
processes on processors is called process switching or
context switching.

Multiprogramming is the use of process switching in
such a manner as to have more than one process in a state of
execution at the same time. Asynchronous multiprogramming
requires communication between processes for
synchronization, e.g., advance and await [16]. Logical
attributes of processes include identification (unique ID or
processor affinity), classification (interprocess priority
or security access class), and state (execution state). For
example, in SASS each process is given a unique identifier
that allows for its identification by the system. It is also
given a security access class, at the time of its creation,
to specify what authorization it possesses. Process
execution states allow for multiplexing of processes
(binding) onto processors. A process that is bound to a
processor is assigned a "running" state; a process in the
"ready" state is waiting to be bound to a processor; and a
process that is "waiting" is awaiting some event in the
system before continuing execution in the "ready" or
"running" states. Multiprogramming is logically a form of
parallel processing, where different processes are in a
state of executing simultaneously.

As can be inferred from the preceding paragraph,
parallel processing does not require more than one
processor. However, in a multiprocessing environment
parallel processing can be more effective. Multiprocessing
implies more than one central processor in the hardware
configuration during system execution.
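
The process attributes and execution states described in this section can be summarized in a short sketch. The transition set below is one interpretation of the text rather than a specification taken from SASS, and the names State, ALLOWED, and Process are hypothetical.

```python
from enum import Enum

class State(Enum):
    READY = "ready"
    RUNNING = "running"
    WAITING = "waiting"

# Transitions implied by the description above (an interpretation).
ALLOWED = {
    (State.READY, State.RUNNING),    # bound to a processor
    (State.RUNNING, State.READY),    # unbound but still runnable
    (State.RUNNING, State.WAITING),  # awaits some event in the system
    (State.WAITING, State.READY),    # event occurred, not yet bound
    (State.WAITING, State.RUNNING),  # event occurred, bound at once
}

class Process:
    """A process with the logical attributes named in the text."""
    def __init__(self, pid, access_class):
        self.pid = pid                    # unique identifier
        self.access_class = access_class  # fixed at creation
        self.state = State.READY

    def move_to(self, new_state):
        if (self.state, new_state) not in ALLOWED:
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}")
        self.state = new_state
```

A ready process cannot move directly to the waiting state in this model, since only a process bound to a processor can execute the operation that makes it wait.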
2. Memory Segmentation

Memory segmentation is a form of memory
virtualization. A segment can be defined as a logical
grouping of information, such as procedures or data areas,
that is of variable length. The address space of each
process is comprised of a collection of those segments that
can be accessed by that process. Since a segment is a
logical unit, it can have logical attributes, as does each
process. Segmentation thus facilitates enforcement of
controlled access to segments by processes based on
comparison of certain logical attributes, e.g., access
class. Access within a segment is made by two-dimensional
addresses: segment specifier and offset.
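
Two-dimensional addressing can be illustrated with a minimal sketch. The Segment fields and the translate function are hypothetical; an actual memory management unit performs the equivalent lookup and bounds check in hardware.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    base: int          # physical address of the first byte
    length: int        # segments are of variable length
    access_class: str  # a logical attribute of the segment

def translate(segment_table, segment_number, offset):
    """Map a (segment specifier, offset) pair to a physical address."""
    # A KeyError here means the segment is not in the address space.
    seg = segment_table[segment_number]
    if not 0 <= offset < seg.length:
        raise IndexError("offset lies outside the segment")
    return seg.base + offset
```

For instance, with a segment table of {3: Segment(0x2000, 0x100, "secret")}, the address (3, 0x10) translates to 0x2010, while (3, 0x100) is rejected because it falls past the end of the segment.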

Segmentation permits multiple processes to share
data and code segments, and thus avoid the "multiple copy"
problem. This eliminates the possibility of having
conflicting data when multiple copies of the same segment
are maintained. Also a reduction in the number of copies
reduces the amount of physical address space used. Shared
segments greatly facilitate inter-process synchronization
and communication for cooperating processes.

3. Abstraction

Dijkstra [15] has shown levels of abstraction to be
a powerful design methodology for complex systems. In
general, the use of levels of abstraction leads to a better
design with greater clarity and fewer errors. Simply put,
abstraction is the application of a general solution to a
number of specific cases. More precisely, the method of
abstraction can be thought of as a methodology for machine
virtualization, where each successively lower level is not
aware of the abstractions or resources of any higher levels.
Higher levels may apply the resources of lower levels only
by making use of the virtual machine provided by the lower
level supporting it. These two rules reduce the number of
interactions among levels of the system and also contribute
to configuration independence in the overall design.

Each level of abstraction creates a virtual machine
upon which the next level runs; no knowledge of lower
machine implementation is of concern to higher levels.
Levels of abstraction can be applied consecutively down to
the hardware architecture if desired. Following the rules of
abstraction results in a loop free design.

4. Protection Domains

Protection domains are used to arrange process
address spaces into rings of different privileges [5]. The
structure essentially divides the address space into levels
of abstractions with strictly enforced "ring crossings" or
gates. These gates protect the machine hierarchy by
enforcing virtual machine boundaries for some, but not
necessarily all, levels of abstraction. The innermost ring
or domain is commonly considered the most privileged.

5. Kernel Design

Asynchronous operating systems generally fall in two
categories: the monolithic operating system approach and the
kernel approach. An operating system using the concept of a
monolithic operating system is based on the premise that the
operating system provides both resource management and the
numerous common services required by user programs. User
programs and system devices do not communicate directly; all
interaction is passed through an operating system. All
functions performed by the operating system are contained in
one large program module.

An alternate approach provides a hierarchy of
"virtual machines". It is based on a fundamental operating
system module called the "Kernel". The system's resource
management activities are minimized and concentrated in the
kernel; various asynchronous activities are moved to
asynchronous system processes, as fits the notion of a
process as described earlier. These system processes are
each given their own virtual processor, which is multiplexed
onto the real CPU by the kernel. These virtual processors
compete in a multiprogramming sense with other virtual
processors onto which application processes are being
multiplexed. Kernel activities normally include
multiprogramming and application inter-process communication
functions.

This smaller module (kernel) is more readily
distributed within the address space of each process and is
constructed as a hierarchy of virtual machines,
thereby providing a loop-free, configuration independent
structure. The distributed kernel is a key property in the
SASS design.

6. Information Security


System security requires the implementation of a
security policy from both external and internal aspects [1].
Total reliance on external methods (outside the operating
system) precludes the controlled sharing of multilevel
information within a system; total reliance on internal
methods (provided by the operating system) leaves the physical
media unprotected. Clearly, systems should employ a
combination of the two.

A typical security policy consists of discretionary
and non-discretionary aspects. Non-discretionary security is
the partitioning of entities into separate, but related
classifications. It can best be related to information
security through the abstraction of the reference monitor
[1]. The reference monitor is composed of subjects, objects,
and an access matrix. From an operating system standpoint,
the notion of a process generally fits the abstraction as a
subject; and data or programs correspond to objects that can
be accessed by subjects. The access matrix represents the
permitted accesses between subjects and objects; each matrix
contains a lattice structure [13] which defines the
relationship between different access classes. The
relationship between a subject's access class (SAC) and an
object's access class (OAC) is as follows:

1. SAC = OAC    read/write permitted
2. SAC > OAC    read only permitted
3. SAC < OAC    no access permitted

This lattice structure can be partially ordered (not all
classes related) or totally ordered (all classes related) as
in the typical DOD classification of secret, confidential,
and unclassified.
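
The three access rules can be expressed as a short sketch for the totally ordered case. The numeric ranks assigned to the classes and the name permitted_access are assumptions made for illustration; a partially ordered lattice would require a comparison that can also report that two classes are unrelated.

```python
# Totally ordered example from the text; the numeric ranks are assumed.
RANK = {"unclassified": 0, "confidential": 1, "secret": 2}

def permitted_access(sac, oac):
    """Return the access a subject (SAC) is permitted to an object (OAC)."""
    s, o = RANK[sac], RANK[oac]
    if s == o:
        return "read/write"   # rule 1: SAC = OAC
    if s > o:
        return "read only"    # rule 2: SAC > OAC
    return "no access"        # rule 3: SAC < OAC
```

Under these rules a "secret" subject may read but not modify a "confidential" object, while a "confidential" subject has no access at all to a "secret" object.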

Discretionary security is subservient to
non-discretionary security in that the latter dominates any
security interpretations. Discretionary security allows for
separate, commonly thought of as internal, partitioning
within the non-discretionary lattice structure. A typical
example is the DOD "need to know" policy where controlled
access is granted or revoked within the non-discretionary
policy.


C. THE SECURE ARCHIVAL STORAGE SYSTEM

As stated earlier, the SASS is a member of the family of
secure operating systems designed by O'Connell and
Richardson [1]. Functionally, it was designed to provide
multiple host computer systems with controlled, shared
access to a multilevel secure archival storage. This
requirement leads to the design goals of internal security
to protect information flow in a distributed computer
network, configuration independence for both system
versatility and security support, and general subsetting
capability to support future system configurations. A
general SASS organization can be seen in Figure III-1.

The key elements in the SASS design are the use of
security kernel technology [12], a distributed operating
system, and resource virtualization. The security kernel is
a kernel based design which includes a realization of the
abstraction of the reference monitor. Non-discretionary
security policy enforcement is added to the basic kernel
functions of providing segmentation, multiprogramming and
interprocess communication. The system is distributed both
logically and physically; logically, parts of the operating
system are distributed within the address space of each host
system. The use of protection domains permits the operating
system to maintain its integrity while interacting with the
Host system [1]. Physical distribution of parts of the
system, when used in conjunction with shared data bases,
eliminates the dependence on a single controlling unit
(master CPU). In addition, an increase in performance may be
realized by reducing potential bus contention when executing
common code.

The SASS environment can now be defined by examination
of the four levels of abstraction shown in Figure III-2.
Level 3 is the Host computer systems: all that is known by a
Host system about the next lower level is the virtual
machine interface provided, specifically a set of five
instructions: create, delete, read, store and modify files.
The Host systems thus interface SASS at level 2, the
Supervisor.

The Supervisor (level 2) marks the beginning of the
operating system code. The operating system consists of two
domains: the Supervisor and the secure Kernel. The
Supervisor operates in the less privileged of the two
domains. The function of the Supervisor is to manage the
input/output protocol with the Host systems and maintain the
hierarchical file structure established for each Host
system. Two processes, the I/O process and the File Manager
process, are created for each Host system at system
initialization. Communications protocol and data packaging
are accomplished by the I/O process. All commands received
and actions initiated are coordinated by the File Manager
process. Functions provided by the File Manager include
management of the Host's virtual file system and the
enforcement of the discretionary security policy. It should
be noted at this point that both levels (2 and 3) exist in a
purely virtual environment: all resources are virtual.

At the interface between the Supervisor (level 2) and
the Kernel (level 1) is the Gatekeeper. All that is known of
the lower levels to the Supervisor is a virtual machine with
an extended instruction set. The virtual machine is the
restricted subset of hardware instructions and the extended
instruction set facilitated by the Gatekeeper. The primary
objective of the Gatekeeper is to isolate the Kernel and
make it tamperproof [7]. The Gatekeeper establishes the
logical boundary between the Supervisor and the Kernel. As a
matter of course it provides a single software entry point
(enforced by hardware) into the Kernel (level 1).

Level 1, the Security Kernel, consists of two
components: the distributed kernel, which logically resides
in the address space of each Host system; and the
non-distributed kernel. The distributed kernel consists of
the Segment Manager, the Event Manager, the
Non-discretionary security module, the Traffic Controller,
the Inner Traffic Controller and the distributed Memory
Manager module. Two modules, the Event Manager and the
Segment Manager, comprise the extended instruction set
contained within the Gatekeeper and available to the
Supervisor. The Segment Manager provides segmented virtual
storage management, and the Event Manager provides
inter-process communication.

The Traffic Controller is really comprised of two
modules: the Traffic Controller listed above, and the Inner
Traffic Controller module. Binding or "mapping" of Host
processes to virtual processors is accomplished by the
Traffic Controller; binding of virtual processors to real
processors is the function of the Inner Traffic Controller.
This two-level traffic controller supports the loop-free
module hierarchy for a general kernel-based operating system
design (see earlier section on kernels). It also supports
use of local memory since the Inner Traffic Controller
manages the virtual processors of each separate CPU.
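
The two binding levels can be pictured with a small bookkeeping sketch. The class and method names below are hypothetical and no scheduling policy is modeled; the sketch only shows that a process reaches a real processor through a virtual processor, never directly.

```python
class TwoLevelBinding:
    """Hypothetical bookkeeping for the two binding levels."""
    def __init__(self):
        self.process_to_vp = {}   # maintained by the Traffic Controller
        self.vp_to_cpu = {}       # maintained by the Inner Traffic Controller

    def bind_process(self, process, vp):
        """Traffic Controller: bind a process to a virtual processor."""
        if vp in self.process_to_vp.values():
            raise ValueError("virtual processor already has a process")
        self.process_to_vp[process] = vp

    def bind_vp(self, vp, cpu):
        """Inner Traffic Controller: bind a virtual processor to a CPU."""
        if cpu in self.vp_to_cpu.values():
            raise ValueError("real processor already has a virtual processor")
        self.vp_to_cpu[vp] = cpu

    def running_on(self, process):
        """Return the real processor executing the process, or None."""
        vp = self.process_to_vp.get(process)
        return self.vp_to_cpu.get(vp)
```

A process bound to a virtual processor that is not itself bound to a real processor is reported as running nowhere, which reflects the separation of the two levels.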

The Non-discretionary security module enforces the
non-discretionary security policy, as the name implies. The
remaining module, the distributed Memory Manager,
facilitates inter-virtual processor communications for
synchronization between the Segment Manager (virtual storage
management) and the non-distributed Memory Manager that
provides storage virtualization of local/global memory and
secondary storage.

The non-distributed kernel consists solely of the
non-distributed Memory Manager just discussed. This Memory
Manager exists as a kernel process (an operating system
function placed outside the distributed kernel, as described
earlier), permanently bound to a virtual processor and in
competition for the physical processor resources managed by
the Inner Traffic Controller. Figure III-3 shows the
two-level traffic controller design and the vehicles for
resource virtualization (both processor and storage).

Level 0 is the system configuration on which the kernel
runs. It consists of the full hardware configuration and the
data structures describing this environment. A discussion of
the full hardware configuration is presented in the
following section. The data structures describing the
environment must contain complete information about the
physical resources to be managed by the kernel, and in a
compatible format. A further discussion of the system
configuration will be presented at the end of this chapter.


D. HARDWARE ARCHITECTURE 
1. Hardware Requirements
Within the SASS design goals as defined by [5], the
concept of resource virtualization is the key to the design
goals of internal security, configuration independence, and
a functional sub-setting capability. Resource virtualization
separates higher levels of the SASS design from the bare
hardware in such a way as to permit implementation of these
design goals. Hardware requirements must then be based on
those hardware features that support resource
virtualization.
a. Processor Virtualization 
A virtual processor is the software
representation of a processor that may be functionally
different from the actual, physical processor upon which it
will run. Processor virtualization also defines a number of
logical processors that are data structures that contain a
complete description of processes at a certain point of
execution on the physical processor. In the instances where
the physical and virtual processors are functionally
identical, virtualization serves only to multiplex processes
(multiprogramming). In either case, hardware requirements to
support processor virtualization are those architectural
features that can be used to bind virtual processors to
physical processors, in a state of execution, viz., for
context switching.
b. Memory Virtualization

Memory virtualization requires the management of
primary and secondary physical memory resources to create
the illusion of a primary memory which is independent of the
actual physical primary memory. This illusionary memory is
called virtual storage. The logical, relocatable,
information objects created by memory segmentation provide
an essential memory multiplexing mechanism for the efficient
implementation of virtual storage [5]. Memory segmentation
also provides a convenient mechanism by which address spaces
may be defined in the creation of processes. Hardware
architectural features that provide for memory segmentation
and support efficient memory virtualization in a
multiprogramming environment are desirable in the SASS
design.

c. Protection Domains

A key concept for the implementation of the
internal security design goal is protection domains.
Protection domains arrange process address spaces into rings
of different execution domains, exhibiting a hierarchical
layering of privileges. In the virtual processor sense, the
execution points of processes become more restricted in less
privileged domains. The hierarchical ring structure supports
processor virtualization where processes within each
successive ring run on a virtual processor that is more
restricted than its base machine.

This ring structure is a hierarchy in which the 
most privileged domain is the innermost ring. The structure 
divides the address space into levels of abstraction with 
strictly enforced gates at the ring boundaries [5]. 
Protection rings may be created in software, but a hardware 
implementation, where gate use is enforced by hardware, is 
much more efficient [14]. Hardware features that restrict 
“privileged” instruction set usage within the physical 
processor support a two-domain ring structure and thus can 
efficiently serve to implement protected domains. 

2. Hardware Selection 

The hardware architectural features described above 
(processor and process multiplexing support, a memory 
segmentation capability, multiple domain memory 
partitioning, and a multiple domain instruction set) are 
essential to an efficient implementation of the SASS. The 
Zilog Z8000 family of 16-bit microprocessors, with an 
architecture which supports memory segmentation and 
two-domain operation, was selected as the target machine 
because of its robust support facilities and close match to 
design requirements. Further, it was selected because of its 
commercial availability as an off-the-shelf, single board 
package for multiprocessor applications and its Multibus 
(INTEL trademark) compatibility. The latter feature was 
deemed necessary to accommodate a more flexible range of 
peripherals. 

The segmented Z8001 microprocessor is a register 
oriented machine, with sixteen 16-bit general purpose 
registers, seven data types (from bits to 32-bit long 
words), and eight user selectable addressing modes. With the 
use of the Zilog Z8010 Memory Management Unit (MMU), it can 
directly access 8 megabytes of memory. A more detailed 
description of both can be found in reference [27]. The 
Z8001 hardware was not available for use during system 
development and, unfortunately, neither was a commercially 
available Z8001 single board package for SASS 
architecture implementation. The actual hardware used in the 
developmental system implementation was the Advanced Micro 
Computers Am96/4116 Monoboard Computer with the 
non-segmented AmZ8002 microprocessor. 

The Am96/4116 Monoboard Computer provides the necessary 
processor virtualization features. With its Multibus (INTEL) 
interfacing, a processor-to-processor interrupt capability 
through the Multibus is available. This interrupt becomes 
the non-vectored interrupt (NVI) source to the Z8002. Local 
and global time-slicing capability with interrupt is 
provided by onboard clock circuitry, which also maintains a 
continuous real-time clock. The above features are useful 
for effective multiprocessing synchronization. 

The Multibus also allows for global memory in a 
multiprocessor environment. Global memory separation in 
blocks of 32K bytes is facilitated by five of the 20 address 
lines provided by the bus and decoded processor signals. 
Local memory separation is effected by onboard wiring 
additions (see appendix A) which allow system mode only 
access to a portion of local onboard memory. Alterations 
were accomplished with the use of available onboard gates 
and soldered connections. This memory partitioning affords 
both kernel code protection and secure process switching in 
this two domain environment. The more general nature of 
memory segmentation design has been preserved by software 
simulation of the MMU hardware. 
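
As an illustration of the 32K-byte partitioning of global memory, the following minimal Python sketch splits a 20-bit bus address into a block number and an offset within the block. It assumes that the five selecting lines are the five most significant address lines, which the text does not state; the name global_block is hypothetical.

```python
ADDRESS_LINES = 20      # Multibus address lines, as stated in the text
BLOCK_BITS = 15         # 32K bytes = 2**15, so 15 lines address within a block


def global_block(address):
    """Split a 20-bit bus address into (block number, offset within block).

    Assumption: the block is selected by the five most significant lines.
    """
    if not 0 <= address < (1 << ADDRESS_LINES):
        raise ValueError("address does not fit in 20 bits")
    block = address >> BLOCK_BITS
    offset = address & ((1 << BLOCK_BITS) - 1)
    return block, offset
```

Five selecting lines give 32 possible blocks of 32K bytes each, which together cover the full one megabyte range of a 20-bit address.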

3. SASS Developmental Architecture 

The general hardware architecture of the SASS 
consists of multiple processors, each with its own local 
memory and two memory management units, a single system bus, 
global memory addressable by all processors, multiple 
peripheral interface capability, and a large archival 
storage aggregate. As shown in Figure II-4, an arbitrary 
number of Z8000 processors and peripheral interfaces (hosts) 
are assumed, with all components sharing a common system 
bus. This general architecture is in keeping with the SASS 
design goals and the characteristics of the Z8000 
microprocessor family. 

As a prelude to the actual initialization mechanism 
development, and to support ongoing and concurrent research 
on the SASS project, the final hardware architecture was 
integrated with the existing Zilog Z8000 development MCZ 
System hardware, which has to date been the principal 
development tool. The Zilog Z8000 Developmental Module (DM), 
a Z8000 based experimental board at the heart of the system, 
has no provision for system bus interfacing and therefore 
does not meet the SASS design goals. The DM was replaced 
with the Am96/4116 Monoboard, which does have the system bus 
interfacing capability and facilitates the SASS design goals 
of global memory, multiprocessors, and archival storage. 
Once the new single board system was integrated with the 
remaining portion of the original developmental system, 
namely the Zilog MCZ-1/05 Z80 based microcomputer system, 
the physical implementation of the SASS hardware 
architecture could be realized. The SASS Developmental 
Architecture evolved as shown in Figure II-5. A detailed 
description of the system can be found in reference [27]. 

Two Am96/4116 Monoboards are currently housed in an 
INTEL ICS-80 chassis, which provides the power supply, 
cooling fans, and the Multibus backplanes. Each Monoboard is 
wire-wrapped to specifications listed in reference [27] and 
integrated into the architecture as shown in Figure II-5. In 
order to reduce the complexity in physical memory addressing 
for the non-segmented Z8002, global and local memory sizes 
are assumed equal and indeed implemented in this manner on 
the Monoboards. The Monoboard assumes a 32K byte block of 
local memory and a 32K byte block of offboard memory, with 
its 16-bit address bus. 

To reiterate, the local 32K RAM onboard memory has 
been configured for system mode only access from absolute 
addresses 0000-3FFF HEX; addresses 4000-7FFF HEX are 
accessible by both normal and system modes. In addition, 
attempted access of memory below 4000 HEX while in the 
normal mode will generate an onboard vectored interrupt. 
Global memory is addressed from 8000-FFFF HEX on the 
Am96/1200 RAM Memory Boards interfaced to the Multibus. 
Currently, three blocks of 32K are utilized and partitioned 
as normal mode only, system mode only, and normal mode/read 
only accesses. The read only access RAM must be supported by 
a direct memory access (DMA) capability in the archival 
storage device, which has not to date been implemented. DMA 
capability allows secondary storage to write to primary 
memory directly, thus providing the necessary access to load 
the processor read-only memory. 
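
The local memory access rules just described can be summarized in a short Python sketch. The function and constant names are hypothetical, and the model covers only the onboard 32K block; it is an illustration of the stated rules, not the wiring logic of appendix A.

```python
SYSTEM = "system"
NORMAL = "normal"

LOCAL_TOP = 0x7FFF           # last address of the 32K onboard block
SYSTEM_ONLY_LIMIT = 0x4000   # addresses below this are system mode only


def local_access(address, mode):
    """Return 'ok' when the access is permitted, else 'interrupt'.

    Models the rule that a normal mode access below 4000 HEX raises an
    onboard vectored interrupt.
    """
    if mode not in (SYSTEM, NORMAL):
        raise ValueError("mode must be SYSTEM or NORMAL")
    if not 0x0000 <= address <= LOCAL_TOP:
        raise ValueError("not a local onboard address")
    if mode == NORMAL and address < SYSTEM_ONLY_LIMIT:
        return "interrupt"
    return "ok"
```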

Assembly language programming (PLZ/ASM) for the 
Z8000 is provided by the Zilog MCZ system software support 
and an Upload/Download capability with one of the 
Monoboards. Program development functions are provided by 
the SASS Developmental Monitor; a complete program listing 
and command syntax description can be found in reference 
[27]. Basically the SASS monitor operates in one of three 
modes: the transparent mode, in which the Z8000 processor 
acts in a relay capacity between the MCZ system and the 
development terminal; the Upload/Download mode, which passes 
developed Z8000 programs from the MCZ system to SASS system 
memory; and the typical monitor mode, for program running 
and debugging. Additional features to support initialization 
are included and will be discussed later. 

Archival storage devices can be almost any available 
technology, e.g., magnetic fixed or floppy disk systems, 
optical disk systems, or magnetic tape. Storage capacity, 
interfacing, and timing are the basic considerations of 
device selection for mass storage in the SASS 
implementation. As previously mentioned, no storage device 
has to date been implemented. 

An additional architectural feature not previously 
discussed or shown in the SASS design is the ability to 
communicate directly with the SASS, not as a peripheral or 
host system, but as the SASS operator. Communications, such 
as occur during system initialization, must occur through a 
terminal device and some program transfer media. An archival 
storage device may fulfill the latter requirement provided 
external portability exists (viz., a disk that can be 
removed and transported). This link would most probably be 
effected through a dedicated peripheral interface. A single 
parallel port is provided on the same Monoboard for the as 
yet undefined Host system. Future work on the developmental 
Host interfacing will use this architectural feature. 


E. SUMMARY 

With the system environment adequately defined, the 
initialization goals can be identified. These goals appear 
to fall into two categories: (1) hardware initialization 
leading to a fully functional hardware configuration; and 
(2) initialization of the full system configuration on which 
the operating system will run. 

Processor initialization in the SASS architecture 
requires that a program-status area (PSA) be established and 
properly loaded, and the PSA pointer register loaded to 
point to that area. Secondly, a stack area must be allocated 
and the default (normal/system) stack pointers set 
appropriately. Thirdly, all necessary interrupt/trap service 
routines must be made available and identified in the PSA; 
initialization of external devices must be performed; and 
the appropriate interrupt structure enabled. Initialization 
of the full hardware architecture involves establishing an 
environment of co-operating processors in which knowledge 
of physical resources is available system-wide. 

Initialization of the full system configuration, or 
“bare-machine”, on which the SASS will run consists of 
making available to the operating system the knowledge of 
the hardware configuration, which includes all physical 
resources and any pre-allocations such as the PSA and 
stacks. Resource knowledge must be consolidated into a form 
suitable for resource management. The manner in which these 
initialization goals are accomplished is the subject of the 
next chapter. 







IV. DESIGN IMPLEMENTATION 


This chapter describes the implementation of the 
initialization design for the SASS Developmental 
Architecture. First, a discussion of the implementation 
objectives is presented, followed by an explanation of those 
restrictions peculiar to this effort which are imposed due 
to either hardware constraints or circumstances. A 
description of each of the program components follows. 
Further, the Bootload program and the Bootstrap program 
(contained in reference [27]) are examined in detail. 
Afterwards, a discussion of how these programs relate to the 
operating system core image when considering run time 
initialization is presented. 


A. OBJECTIVES 

The primary objective of this implementation is to use 
the design methodology presented in this thesis to effect an 
initialization mechanism that will produce a running SASS. 
The implementation must be able to initialize the current 
demonstration package as well as the final version of the 
SASS, while working within the developmental architecture. A 
clear choice between using the developmental environment or 
running the SASS must be given to the operator. Using the 
developmental system requires the initialization of a 
monitor that provides the operator with the developmental 
tools he needs. Currently, running the SASS involves running 
the demonstration package since the operating system 
implementation is not completed. 

A secondary objective requires that the design 
methodology produce flexible and versatile bootload and 
bootstrap program structures and algorithms for general 
application. The bootload program, which comprises the 
firmware, must be readily adaptable to many hardware 
architectures. The bootstrap program should accommodate a 
wide range of monitor or kernel based operating systems. In 
addition, each component must be independent of the others 
except for the dynamic parameter passing mechanism. 

Since the current SASS demonstration package is not yet 
functional in a multiprocessor environment, a separate test 
program must be created to demonstrate the initialization 
mechanism with multiple microprocessors. These programs will 
be loaded and run in place of the operating system. 


B. RESTRICTIONS 

Several hardware related constraints are placed on the 
implementation. One restriction, which significantly affects 
dynamic resource determination, is the requirement for 
Memory Management Unit (MMU) simulation. The use of wiring 
modifications for local memory protection and hardware 
domain signals to partition global memory has complicated 
the experimental method of determining memory resources. In 
a hardware supported segmentation environment using MMU's, 
this is accomplished in a straightforward manner by a 
read/write determination of the full physical address space; 
domain signals are not used for memory segregation. However, 
in the current hardware architecture, which uses domain 
signals to partition memory, memory resources must be 
experimentally determined in both the system and normal modes. 

The code for mapping memory in the normal mode must 
reside in a portion of memory accessible in the 
normal mode. This means that if the code being executed prior 
to switching to the normal mode is currently in an area 
accessible only in the system mode, it must be moved to an 
area addressable in the normal mode. Since a more logical 
choice would be to execute any code which contains mode 
switching in an area accessible by both modes, the 
implementation choice was made to fix the location of any 
code moves at system generation time. This restriction is 
strictly an implementation dependent detail derived from a 
simulation environment to start with, and will be discussed 
further under the bootstrap program section. 

A major hardware restriction affecting the overall 
appearance of the programs, but not their functions, is the 
requirement to use the MCZ microcomputer development system 
as the secondary storage device to contain the bootstrap and 
operating system core images. Effecting a bootstrap loading 
of these core images is significantly more involved than 
utilizing disk controller (i.e., hard disk) primitive 
operations. 

The MCZ ZILOG RIO operating system makes use of a 
packet-passing protocol (Tektronix format) [24] in order to 
effect upload/download operations through its serial port 
connected to the MonoBoard. This packet-passing protocol 
requires more coding than typical disk controller 
primitives. A suitable secondary storage device was not 
available during the development of this implementation. 

This increase in the amount of code also led to another 
implementation choice. While it is true that a developmental 
monitor program is actually just another operating system, 
the conventional manner of storing the program is as a part 
of the firmware. ROM size restrictions, when coupled with the 
increased amount of coding mentioned above and the desire to 
adhere to the design methodology, led to storing the 
developmental monitor on secondary storage (i.e., MCZ). 

It should be noted that the secondary storage 
interfacing programs reside within the firmware, and as such 
are a hardware dependent feature. They must be added to the 
bootload program when adapting it to a specific hardware 
architecture. 

One pertinent software restriction has been 
discussed earlier. The source code should be programmed 
without any absolute addressing. Some form of relative 
addressing or base indexed addressing should be used. This 
allows the code to be placed at any location in addressable 
memory and to execute properly. This holds for the bootload 
program as well, which may reside in a PROM that may be 
hardwired to any physical address. Programs and program 
data structure core images must then be created to run at 
absolute address zero to facilitate base indexed addressing. 

The resultant implementation choices led to generation 
of the bootload program for firmware, the bootstrap program 
for bootstrapping the SASS, and the monitor program for the 
SASS Developmental Monitor. A discussion of the monitor 
program is contained in reference [27]. 

Figures IV-1 and IV-2 provide an overview of the 
bootload program control flow. Figure IV-3 overviews the 
bootstrap program. In this implementation the bootload 
program is composed of two main modules and three support 
modules. The support modules serve only to interface 
secondary storage. The bootstrap program consists of one 
main module and the same three support modules. 


C. BOOTLOAD PROGRAM 

The system initialization mechanism was designed to 
commence operating once power is applied to the system or, 
as is the case with the RESET switch, power is interrupted 
momentarily. This in turn causes each processor to acquire 
its initial execution point from within its initial address 
space in the firmware. This address space contains the 
bootload program. Each processor will have its own 
serialized firmware, as stated previously. For the Z8002 
microprocessor the Flag Control Word (FCW) portion of the 
initial execution state is obtained from address 0002 HEX 
and the Program Counter (PC) from address 0004 HEX. Bootload 
program execution will begin with this defined processor 
state. 
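
A minimal Python sketch of this start-up fetch follows. It assumes the firmware image is available as a byte string and that 16-bit words are stored high byte first; the function name reset_state and the sample image in the usage note are hypothetical.

```python
import struct

FCW_ADDRESS = 0x0002   # Flag Control Word location, per the text
PC_ADDRESS = 0x0004    # Program Counter location, per the text


def reset_state(firmware):
    """Return (FCW, PC) as read from the start of a firmware image.

    The two words are adjacent, so a single big-endian unpack of two
    unsigned 16-bit values starting at the FCW address reads both.
    """
    if len(firmware) < PC_ADDRESS + 2:
        raise ValueError("firmware image too short to hold FCW and PC")
    fcw, pc = struct.unpack_from(">HH", firmware, FCW_ADDRESS)
    return fcw, pc
```

For example, an image beginning with the bytes 00 00 40 00 01 00 yields an FCW of 4000 HEX and an initial PC of 0100 HEX.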

At this point the processor is working in an environment 
without RAM and with no knowledge of any other processors. 
This marks the start of the Independent Processor Stage. The 
bootload program listing is contained in reference [27]. 
Figure IV-1 shows the independent processor and local 
initialization stages that comprise the first bootload 
module. 

1. Independent Processor Stage 

The tasks of the independent processor stage 
consist of clearing memory and defining primary storage. 
It has been found in this implementation that the scribing 
operation of the cooperating processors' tasks can be more 
efficiently performed concurrently with the clearing and 
defining operations. 

Each processor begins by clearing memory in the 
system mode by writing a pattern (55AA) at the beginning of 
each block of memory and writing zeros to the next five 
locations. This read/write pattern was selected for reasons 
described in chapter II. The size of the block was chosen to 
be 800 HEX, which is the smallest size of ROM supported by 
the Am96/4116 MBC hardware architecture. Using this method, 
a routine traverses the physical address space clearing 
memory. After this is accomplished, each processor waits 
approximately 2 milliseconds, enough time for any other 
processors to complete the same task. 
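
The clearing pass can be sketched in Python as follows. This is an illustration only: memory is modelled as a dictionary of RAM blocks, the "next five locations" are taken to be five 16-bit words (the text does not give the unit), and the name clear_memory is hypothetical.

```python
BLOCK_SIZE = 0x800       # block size chosen in the text
ADDRESS_SPACE = 0x10000  # 16-bit address bus
PATTERN = 0x55AA         # read/write pattern from the text
CLEARED_WORDS = 5        # assumption: "five locations" means five words


def clear_memory(ram_blocks):
    """Write the pattern and five zero words at the start of every block.

    ram_blocks maps a block base address to a list of 16-bit words.
    Blocks with no RAM behind them are simply absent from the mapping,
    which models a write that is lost.
    """
    for base in range(0, ADDRESS_SPACE, BLOCK_SIZE):
        words = ram_blocks.get(base)
        if words is None:
            continue
        words[0] = PATTERN
        for index in range(1, 1 + CLEARED_WORDS):
            words[index] = 0
```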

Next, memory is scribed (SCRIBE MEM) by each 
processor by use of the bus locking mechanism. At the 
conclusion, each processor again waits for any other 
processors that might exist to complete the task. Each 
processor is still operating independently. 

The DEFINE MEM routine makes use of the results from 
the clearing and scribing operations to determine the 
addresses of the lowest blocks of local and global memory. 
The results of the scribing operation distinguish local 
and global memory by establishing access to memory by more 
than one processor. Additionally, the highest scribed value 
obtained during the searching of memory becomes the number 
of intercommunicating processors in the system. For the 
purposes of this implementation, the cooperating processors 
and local initialization stages will rely on the contents of 
these dedicated registers: 


R5 = number of processors 
R6 = highest local address 
R7 = lowest global address 

Notice that all code execution has been sequential (viz., no 
calls or returns) thus far due to the undefined stack. At 
this point, available read/write memory for the system mode 
is known. For the single processor case, however, global 
memory is undefined and must be set statically. This can 
either be done by absolute addressing, as in this 
implementation (8000 HEX), or by adding an appropriate value 
to the low local address, to achieve separation for 
establishing the configuration table. For ease of coding, it 
is desirable to initialize the internal CPU register used as 
the stack pointer in order to facilitate procedure calls. It 
is for this reason, and the desire to keep related functions 
contiguous, that the local initialization stage is 
accomplished next. 
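
The scribing and defining operations can be illustrated with a small Python simulation. The scribing algorithm itself is specified in chapter II and is not reproduced here, so this sketch assumes that each processor adds one to a count word in every block in which it finds the read/write pattern. The dictionary model and the names scribe and define_mem are hypothetical.

```python
PATTERN = 0x55AA


def new_block():
    """A cleared RAM block: pattern present, scribe count zero."""
    return {"pattern": PATTERN, "scribe": 0}


def scribe(view):
    """Add one to the scribe count of every block this processor can reach.

    On the real hardware each update would be made under the bus lock.
    """
    for block in view.values():
        if block["pattern"] == PATTERN:
            block["scribe"] += 1


def define_mem(view):
    """Return (processor count, lowest local base, lowest global base).

    A count of one marks a block seen by this processor alone; a larger
    count marks a shared block. The global base is None when no block is
    shared, as in the single processor case.
    """
    counts = {base: b["scribe"] for base, b in view.items()
              if b["pattern"] == PATTERN}
    n_cpus = max(counts.values())
    local = min((b for b, c in counts.items() if c == 1), default=None)
    shared = min((b for b, c in counts.items() if c > 1), default=None)
    return n_cpus, local, shared
```

With a single processor every reachable block carries a count of one, so no block can be recognized as global; this is consistent with the need to set global memory statically in that case.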

2. Local Initialization Stage 

Having an address of accessible memory, certain 
internal, special purpose CPU registers can be set. These 
registers are used as pointers into this addressable primary 
memory space to establish the system mode stack and the 
program status area (refer to reference [27] for a 
discussion). The stack pointer is implicitly assigned to 
register R15 for the non-segmented Z8000. Setting R15 to the 
stack area contained within the first 124 HEX of the low 
local address block facilitates the use of CALL and RETURN 
instructions. Since all programs produced at system 
generation time can be located at any physical address, a 
dedicated CPU register (R12) must be set to the base address 
of the current program location to allow for the base 
indexed addressing mode that is used to achieve 
relocatability. All procedure and label addresses used 
during stack operations and control branching are derived 
from offsets applied to this dedicated code address 
register. 

A similar method is applied to variable references 
within the code since the address area containing these 
variables is defined dynamically. The data structures used 
by the various programs are created as templates at system 
generation time at absolute address zero, and base indexing 
with a dedicated data address register (R14) is used in 
referencing the variables. Once this register is set, the 
processor is no longer in a variable free environment. 
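
The effect of generating images at absolute address zero and applying a base register at run time can be shown with a short Python sketch. The label names, offsets, and base values below are hypothetical; only the rule (address = base register + offset) is taken from the text.

```python
def effective_address(base_register, offset):
    """Base indexed address on a 16-bit machine: base plus offset, modulo 64K."""
    return (base_register + offset) & 0xFFFF


# Offsets as they might appear in images generated at absolute address zero.
CODE_LABELS = {"START": 0x0000, "MAP_MEMORY": 0x0120}      # hypothetical offsets
DATA_TEMPLATE = {"CPU_ID": 0x0000, "RING_BUFFER": 0x0010}  # hypothetical offsets


def resolve(table, name, base_register):
    """Resolve a label or variable against the given base register value."""
    return effective_address(base_register, table[name])
```

Because every label is an offset from zero, the same image runs unchanged wherever it is loaded; only the code base (R12) and data base (R14) values differ.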

A Z8000 required data structure is its effective 
interrupt/trap jump vector, or its program status area. Now 
having available memory, the program status area can be 
established by setting the program status area pointer 
(PSAP). 

In the current implementation the high block of 
local memory addressed by register R6 is allocated for the 
stack, the program status area, and the data area 
(variables). The stack area is allocated from 0700-07FF HEX 
with the stack pointer (R15) set to 07F0 HEX relative to R6; 
the PSA is assigned from 0600-0700 HEX with the PSAP set to 
address 0600 HEX relative to R6; the CODE AREA register R12 
is set dynamically to the base of the bootload program 
location; and the DATA AREA register R14, which facilitates 
the use of variables, is set to address 0200 HEX relative to 
R6. 

Included in the local initialization stage is the 
actual initialization of the single processor data 
structures. This is collectively called software 
initialization. The data area pointed to by R14 is cleared; 
variable storage areas are brought to a known state; and 
those variables requiring initial values are appropriately 
set. To enable input/output communications with the console 
and with the MCZ system (if connected) for upload/download, 
the input ring buffer must be filled with spaces (blanks). 

The PSA is initialized to enable interrupts for 
communications through the terminal or with the MCZ system. 
All FCW's for the PSA are initialized to the system mode 
(4000) to disable additional vectored interrupts until the 
current handler routine is completed. The console port and 
MCZ port interrupt handlers (procedures CONINT and MCZHUND) 
are set in the PSA to allow communications; and the 
non-maskable interrupt (NMI) is set as a way to return from 
the transparent mode when initializing the MCZ system. In 
the transparent mode, the bootload program loops, passing 
data between the console and the MCZ system. In short, those 
handlers initially set in the PSA are to enable that I/O 
which facilitates system initialization. 

Those single processor tasks which initialize 
hardware devices external to the CPU are grouped into the 
HDW INIT routine. Components initialized here include those 
hardware devices without which no I/O would be physically 
possible. For example, the Interrupt Controller (8259A) chip 
on the MonoBoard and the serial port USARTs (9551) which 
service the console and MCZ systems are initialized first. 
As a matter of convenience, other hardware components, though 
not necessary to continue initialization, are initialized in 
the HDW INIT routine as well. In this implementation the 
Timing Controller (9513) IC is also initialized to provide 
the clocks, counters, and software interrupt sources to be 
used later by the system. 

The last and probably most important task of the 
local initialization stage is to physically permit the I/O 
communications that have been provided for, by enabling the 
vectored interrupts (i.e., console and MCZ ports). When 
finished with the local initialization stage the independent 
processor has a data area for variable storage, an interrupt 
driven I/O communication mechanism, and all hardware in an 
initialized state. 

3. Cooperating Processor Stage 
Actually some of the tasks that comprise the 
cooperating processor stage were performed concurrently with 
those of the independent processor stage as described 
before. They began with the first locking of the system bus 
during the scribing operation. At this point in the bootload 
program, the number of processors and the existence of local 
and global primary memory are known. The range of the local 
and global memories is yet to be determined. This is 
accomplished by mapping memory. Each processor must map its 
own known memory space in a coordinated fashion and provide 
this information to the system. 

To allow for inter-processor communication, the 
configuration table is implicitly assigned in the low global 
memory block as pointed to by register R7. As can be seen by 
the configuration table data structure in Figure IV-2, the 
read/write pattern location and CPU count used during the 
scribing operation, for both the normal and system modes, 
are incorporated into the table. This allows for 
preservation of the clearing and scribing operations already 
performed. The configuration table proper begins with the 
table lock. 

The table lock provides the mutual exclusion 
mechanism for controlled sharing of the table. The next word 
location is the CPU count which is to be used by each 
processor to determine its logical CPU number, and by the 
bootload CPU to count processors' responses. All 
individual processors' entries in the table are contained in 
the CPU list entry. 

CONFIGURATION_TABLE RECORD [ 
    R/W_PATTERN     WORD 
    CPU_NUM         WORD 
    NORM_R/W_PAT    WORD 
    NORM_CPU_CNT    WORD 
    TABLE_LOCK      WORD 
    CPU_CNT         WORD 
    CPU_LIST        ENTRY_ARRAY ] 
ENTRY_ARRAY ARRAY [ MAX_CPU CPU_ENTRY ] 
CPU_ENTRY RECORD [ 
    SIGNAL          WORD 
    CPU_ID          WORD 
    MSG_BLK         MESSAGES 
    MEM_MAP         MEM_ARRAY ] 

Configuration Table Declaration 
Figure IV-2 

A bootload CPU must now be determined. Each 
processor attempts to gain access to the table by setting 
the table lock. To do this each processor performs a 
test-and-set operation utilizing the bus locking mechanism 
to insure the integrity of the operation. The first attempt 
to lock the system bus by each processor creates a race 
condition, the winner of which becomes the first processor 
to access the table and thereby becomes logical CPU 0 and 
the bootload CPU. 

After gaining access to the configuration table, 
each processor checks to see what its logical CPU number is 
from the CPU count entry. If it is 0 then the processor 
becomes the bootload CPU; otherwise the processor becomes a 
member CPU. A differentiation of processors has now been 
made and each branches to the appropriate section of code. 
Figure IV-3 contains the algorithm for the cooperating 
processors stage that comprises the second bootload module. 
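
The race for the table lock can be illustrated with a Python simulation in which threads stand in for processors and a threading.Lock stands in for the hardware bus lock. This is a simplified sketch, not the firmware algorithm: the class and function names are hypothetical, and the table entry steps are reduced to reading and incrementing the CPU count.

```python
import threading
import time


class ConfigTable:
    """The two shared words used to elect the bootload CPU (simplified)."""

    def __init__(self):
        self._bus = threading.Lock()   # stands in for the hardware bus lock
        self.table_lock = 0
        self.cpu_cnt = 0

    def test_and_set(self):
        """Atomically return the old lock value and leave the lock set."""
        with self._bus:
            old = self.table_lock
            self.table_lock = 1
            return old


def enter_table(table, logical_numbers, cpu_id):
    """Acquire the table, take the next logical CPU number, then unlock."""
    while table.test_and_set() != 0:   # lock already held: try again
        time.sleep(0)                  # yield to the other threads
    logical_numbers[cpu_id] = table.cpu_cnt   # 0 marks the bootload CPU
    table.cpu_cnt += 1
    table.table_lock = 0               # unlock the table


def run(n_cpus):
    """Start n_cpus simulated processors and return the table and numbers."""
    table, numbers = ConfigTable(), {}
    threads = [threading.Thread(target=enter_table, args=(table, numbers, i))
               for i in range(n_cpus)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return table, numbers
```

Whichever simulated processor wins the first test-and-set reads a CPU count of zero and so becomes the bootload CPU; each later processor reads a distinct, larger number.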

a. Bootload CPU 

The bootload CPU increments the CPU count to 
indicate the next processor's logical CPU number. Next it 
will bring the CPU list entry in the table to a known state 
by clearing enough CPU entry blocks to accommodate all known 
processors (R5). After clearing the table entries, the 
bootload CPU proceeds to make its own entry in the table as 
logical CPU 0. 

[Figure IV-3: flowchart of the second bootload module. 
Recoverable box labels: MEM CPU; BOOTLOAD CPU; CLEAR CONFIG 
TABLE; ENTER CPU ID, MEM MAP; INCREMENT CPU CNT; UNLOCK 
CONFIG TABLE; WAIT FOR SIGNAL; WAIT FOR ALL CPU's; SIGNAL 
OTHER CPU's; PROMPT CONSOLE; WAIT FOR SIGNAL OR CONSOLE 
INPUT; CONSOLE INPUT? (YES/NO); PROMPT MCZ; RESET CONSOLE; 
LOAD MONITOR; LOAD BOOTSTRAP; SIGNAL TRANSFER; DISABLE PROM; 
BEGIN MONITOR; BEGIN BOOTSTRAP.] 

Bootload2 Module 
Figure IV-3 

The CPU entry consists of a signal word, a CPU ID 
word, a three word message block, and the actual memory map. 
The signal is used by the bootload CPU to inform the member 
CPU's that it has placed a message in their message blocks 
and that the next sequential action can now be performed. 
The CPU ID entry is where each processor enters its own 
unique identification number. The memory map is a byte map 
of memory blocks for both the system and normal domains. 
Note that the size of the CPU entry is fixed at system 
generation time when the block size is determined. The size 
of the configuration table, however, is dynamically 
determined at runtime. An upper bound of the size of the 
table is fixed for programming convenience. 
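
The sizing remarks above can be made concrete with a small Python calculation. It is a sketch under stated assumptions: the header is taken as the six words that precede the CPU list in Figure IV-2, the memory map is taken as one word per 800 HEX block of a 64K address space, and the table base of 8000 HEX in the usage note is only an example. All names are hypothetical.

```python
WORD = 2                    # bytes per 16-bit word
BLOCK_SIZE = 0x800          # fixed at system generation time
ADDRESS_SPACE = 0x10000     # 16-bit address bus
HEADER_WORDS = 6            # assumption: the six words ahead of the CPU list
MSG_WORDS = 3               # three word message block
MAP_WORDS = ADDRESS_SPACE // BLOCK_SIZE   # assumption: one word per block

# signal word + CPU ID word + message block + memory map
ENTRY_BYTES = (1 + 1 + MSG_WORDS + MAP_WORDS) * WORD


def entry_base(table_base, logical_cpu):
    """Address of the CPU entry for a given logical CPU number."""
    return table_base + HEADER_WORDS * WORD + logical_cpu * ENTRY_BYTES


def table_bytes(n_cpus):
    """Size of the table, which depends on the run-time processor count."""
    return HEADER_WORDS * WORD + n_cpus * ENTRY_BYTES
```

Under these assumptions each entry occupies 74 bytes, so with a table at 8000 HEX the entry for logical CPU 1 begins at 8056 HEX, and a two processor table occupies 160 bytes.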

The bootload CPU, after having entered its own 
unique identification number into the table, proceeds to map 
its physical address space for the system mode, making use 
of the clearing and scribing results from before. Memory 
mapping is performed by the MAP_MEMORY procedure, which 
constructs the map with a call to the system mapping 
procedure. Each memory block is mapped with a word; the high 
byte represents accessibility in the system mode and the low 
byte represents the normal mode that will be mapped later. 

The system mapping procedure makes an access determination
for each memory block. If a given memory block is
accessible, as indicated by the presence of the read/write
pattern (keep in mind that a read only access is defined by
an instruction fetch operation and not a data fetch as is
performed here), it is then designated according to the
number of processors having access, as indicated by the
scribe location entry. If only one processor has access, as
indicated by scribe location equal to one, the block is
designated as local memory in the map by a '01' entry. If
scribe location is equal to the total number of processors
known to the system, the memory block is labeled as global
('02'). Access by a number of processors greater than one
but less than the total defines the block as non-usable
('04'). All memory blocks not containing the read/write
pattern are not accessible and designated '05'. The '03'
designation will be used later during the normal mode
mapping operation to indicate access in both the normal and
system modes.
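
The access determination can be sketched as a small
decision function. This is an illustration only: the name
classify_block and its parameters are hypothetical, and the
numeric code values are assumed for the example.

```python
# Assumed designation codes for the system mode byte of a map word.
LOCAL = 0x01           # exactly one processor has access
GLOBAL = 0x02          # every processor known to the system has access
NON_USABLE = 0x04      # shared by some, but not all, processors
NOT_ACCESSIBLE = 0x05  # read/write pattern not found in the block


def classify_block(pattern_present: bool, scribe_count: int,
                   total_cpus: int) -> int:
    """Designate one memory block from the scribing results.

    pattern_present: the read/write pattern was found in the block.
    scribe_count: value of the scribe location, i.e. the number of
        processors that gained access to the block.
    total_cpus: total number of processors known to the system.
    """
    if not pattern_present:
        return NOT_ACCESSIBLE
    if scribe_count == 1:
        return LOCAL  # checked first, as in the text
    if scribe_count == total_cpus:
        return GLOBAL
    return NON_USABLE
```

Because the local test comes first, a single processor
system designates all of its accessible memory as local in
this sketch.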

After having made its own memory map entry into 
the table, the bootload CPU unlocks the table to allow 
access by the other processors. It then waits for all member 
CPU's to make their entries; this is indicated by the CPU
count which is incremented by each processor after it has 
made its entry in the table. 

b. Member CPU 

Each member CPU proceeds to obtain its logical CPU number
from the CPU count and to compute the base address of its
entry. It will next make its own entry: first its unique
identification number and then its own memory map. The
member CPU then increments the table CPU count and unlocks
the table to allow access by the remaining processors.
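
The entry protocol can be sketched with a software lock
standing in for the hardware bus lock. The names ConfigTable
and make_entry are hypothetical, and the sketch assumes the
bootload CPU makes its entry through the same routine before
any member CPU.

```python
import threading
from typing import Dict, List, Tuple


class ConfigTable:
    """Illustrative model of the shared configuration table."""

    def __init__(self) -> None:
        self.lock = threading.Lock()  # stands in for the bus lock
        self.cpu_count = 0            # number of entries made so far
        self.entries: Dict[int, Tuple[int, List[int]]] = {}


def make_entry(table: ConfigTable, cpu_id: int,
               memory_map: List[int]) -> int:
    """Make one processor's entry and return its logical CPU number."""
    with table.lock:                       # gain exclusive access
        logical = table.cpu_count          # logical number from the count
        table.entries[logical] = (cpu_id, list(memory_map))
        table.cpu_count += 1               # increment only after the entry
    return logical                         # table is unlocked on exit
```

Taking the logical number and incrementing the count under
one lock is what keeps two processors from claiming the same
entry.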

Each member CPU will now wait for the signal from the
bootload CPU to continue. A slight delay between signal
checks is added to reduce contention for the system bus
while making global memory accesses.

c. Bootstrap Loading

The normal sequence of events within the bootload program
would be to have the bootload CPU load the bootstrap program
and then signal a transfer of control by all processors, out
of firmware and into the bootstrap program. However, in this
implementation, only one processor can be connected to the
MCZ microcomputer system being used for secondary storage.
To preserve the generality of the design, the bootload CPU
in this implementation was not forced to be that processor
attached to the MCZ system. It is therefore necessary at
this point to determine which processor is connected to the
MCZ and allow that processor to effect the bootstrap program
download and transfer of control by all CPU's.

An initial prompt ('*') is sent to the console port of each
processor to signify that bootload operations have been
completed (except for bootstrap loading) and that I/O is now
established. Each processor then enters a program loop
waiting for either an input from the console or a signal
word entry in the configuration table. An input from the
console designates the processor attached to the MCZ system.
A signal word entry signals a transfer of program control to
the bootstrap program entry point contained in the first
word of the message block.
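
The waiting loop can be sketched as below. The hardware
accesses are passed in as callables so that the logic stands
on its own; the names wait_for_console_or_signal,
poll_console, read_signal and read_entry_point are
hypothetical, and treating a non-zero signal word as "set"
is an assumption.

```python
import time
from typing import Callable, Optional, Tuple


def wait_for_console_or_signal(
    poll_console: Callable[[], Optional[str]],
    read_signal: Callable[[], int],
    read_entry_point: Callable[[], int],
    delay_s: float = 0.0,
) -> Tuple[str, object]:
    """Loop until a console character or a signal word entry is seen.

    Returns ("coordinator", char) if console input arrived, meaning this
    processor is the one attached to the MCZ system, or
    ("transfer", entry_point) if the signal word was set, where the entry
    point is read from the first word of the message block.
    """
    while True:
        ch = poll_console()
        if ch is not None:
            return ("coordinator", ch)
        if read_signal() != 0:
            return ("transfer", read_entry_point())
        time.sleep(delay_s)  # slight delay to reduce bus contention
```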

The processor which is attached to the MCZ system becomes
the bootload coordinator to effect the download of a program
from the MCZ. The bootload coordinator first ensures that
the MCZ system is initialized by sending the prompt "RESET
MCZ" to the console for operator action, and entering a
transparent mode where all further console entries are
relayed to the MCZ system and vice versa. To exit the
transparent mode, the NMI interrupt that returns program
execution to the NMI_RTN point is used.

At this point the bootload coordinator must permit the
console operator the choice of running the SASS
Developmental Monitor or the SASS. An "S" entered from the
console signifies a bootstrapping of SASS, while any other
entry denotes the developmental system. In the former case,
the bootload coordinator effects the downloading of the
bootstrap program; in the latter case, the SASS
Developmental Monitor is downloaded. A single processor
system is assumed when establishing the developmental
system, with the other processors not being loaded with the
monitor. The monitor program is written "under" the shadowed
firmware and the PROM's must be disabled before execution of
the monitor can begin. A procedure to disable the PROM moves
the code outside the firmware into local memory and effects
a transfer to the code, which disables the PROM's and
transfers to the monitor program entry point. In the case
where the operator selects SASS loading, the bootload
coordinator loads the program and signals the transfer of
control to all other processors by using the signal word and
message block entries of each processor.


D. BOOTSTRAP PROGRAM 

The bootstrap program performs the hardware resource
knowledge consolidation and operating system loading. This
is accomplished in two stages: the global initialization and
the core image load. In this implementation the complete
mapping of memory requires mapping in the normal mode as
well. The normal mode mapping could not have been performed
at the same time as the system mode mapping since the PSA
was undefined at that time. The PSA is required for the use
of the Z8000 system call (SC) instruction which facilitates
mode switching, more specifically the switching from the
normal to the system mode. The SC instruction causes an
internal CPU trap through the program status area for
determination of the trap handler which is executed in the
system mode.

The start of the bootstrap program, as seen in Figure IV-4,
performs the normal mode memory mapping. As mentioned
earlier in the chapter, the code for the normal mode mapping
must be moved to a fixed local area of memory accessible
both in the normal and system modes. The inability to
dynamically determine an area to which the code may be
moved, without first executing the code, creates a difficult
situation; hence, the code relocation address is fixed at
system generation time. This situation is a product of the
current architectural restrictions. Given a suitable
secondary storage device with DMA capability, and the
already existing global read-only (code) memory, the dilemma
no longer exists. The bootstrap program would simply be
loaded by the secondary storage device into this read-only
global memory and execution of the code could proceed
sequentially in both domains without any relocations. Of
course, with the segmented architecture, memory mapping
within domains is not applicable and the code may be removed
entirely.

[Flowchart omitted. Recoverable box labels: begin bootstrap;
move code to N/S address space; clear, scribe normal mode;
lock config table; enter normal mem map; increment CPU
count; unlock config table; set CPU_CNT = 0; wait for
signal; load kernel; download kernel; wait for signal; begin
SASS.]

Bootstrap Module
Figure IV-4

The code is moved to a common system/normal mode area of
local memory and control is transferred. In the normal mode
(less privileged domain), I/O instructions and instructions
that change CPU special purpose registers cannot be
executed; therefore the system call instruction is used to
effect execution of certain of these privileged instructions
while in the normal mode. In particular, the bus lock and
unlock instructions (I/O) and the switching of modes
(changing the FCW register) are required. The first section
of code clears and scribes memory in the same manner as
used for the system mode.

At this point the bootload CPU must again be determined.
Each processor attempts to gain access to the configuration
table and, when successful, compares its own logical CPU
number (passed as a parameter from the bootload program)
with the CPU count. If it finds a match, a check for logical
CPU number 0 (bootload CPU) is made; the bootload CPU
resumes its role and the member CPU's resume their roles. In
the same manner as for the system mode, the bootload CPU
maps and enters its own normal mode memory accesses. It then
supervises the same task for the remaining member CPU's.
Again the mapping code must be moved by each processor to
the pre-defined local address location (4100 HEX) that is
accessible in both the normal and system modes.

1. Global Initialization Stage

The global initialization stage tasks include consolidation
of the individual processors' memory maps to form a system
map having global memory knowledge and the creation of a
logical-to-physical CPU map. In this implementation, no
further formatting of the resource knowledge is performed
since the processes within the SASS that use the knowledge
have not been implemented. In more general applications,
this section of the code may contain further processing of
this resource knowledge into a standard form.
The resource knowledge now available includes: a unique
identification for each system logical CPU; a local memory
map for each processor; and a system wide global memory
map.
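
The consolidation can be sketched as follows. This is a
hedged illustration: the function name consolidate, the
input shape, the code values for local and global blocks,
and the rule that a block is treated as global only when
every processor designated it so are all assumptions.

```python
from typing import Dict, List, Tuple

LOCAL, GLOBAL = 0x01, 0x02  # assumed designation codes


def consolidate(
    entries: List[Tuple[int, List[int]]],
) -> Tuple[Dict[int, int], Dict[int, List[int]], List[int]]:
    """Merge per-processor maps into system wide resource knowledge.

    entries[n] holds (unique CPU id, per-block codes) for logical CPU n.
    Returns the logical-to-physical CPU map, the local block numbers of
    each logical CPU, and the block numbers that are global.
    """
    logical_to_physical = {n: cpu_id for n, (cpu_id, _) in enumerate(entries)}
    local_maps = {
        n: [blk for blk, code in enumerate(codes) if code == LOCAL]
        for n, (_, codes) in enumerate(entries)
    }
    num_blocks = len(entries[0][1]) if entries else 0
    global_map = [
        blk for blk in range(num_blocks)
        if all(codes[blk] == GLOBAL for _, codes in entries)
    ]
    return logical_to_physical, local_maps, global_map
```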

2. Core Image Load Stage 

The core image load stage involves first loading the kernel
portion of the operating system, and then loading the
supervisor portion. The bootload CPU loads the Kernel (of
SASS) into an available global address that it determines
from the system global memory map. After the core image is
loaded, the bootload CPU sets the message block of each
member CPU to (1) the address where the core image was
loaded, (2) the address in local memory to where it is to be
moved, and (3) the number of bytes to be moved. The bootload
CPU next downloads the core image into its own local memory
and signals the other processors to download as well.
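
The use of the three word message block can be sketched as
below. Memory is modeled as one flat bytearray holding both
the global and the local region, which is a simplification;
the function names are hypothetical.

```python
from typing import List


def set_message_blocks(message_blocks: List[List[int]], image_addr: int,
                       local_addr: int, nbytes: int) -> None:
    """Bootload CPU: tell every member CPU where the image is and where
    it is to be moved, using the three words of each message block."""
    for block in message_blocks:
        block[0] = image_addr   # (1) address where the image was loaded
        block[1] = local_addr   # (2) destination address in local memory
        block[2] = nbytes       # (3) number of bytes to be moved


def download_core_image(memory: bytearray, message: List[int]) -> None:
    """Any CPU: copy the core image from global into local memory."""
    src, dst, nbytes = message
    memory[dst:dst + nbytes] = memory[src:src + nbytes]
```

Each member CPU would run the download only after seeing its
signal word set, so that the message block contents are
known to be valid.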

The second task of the core image load stage is to load the
Supervisor into an appropriate location in memory. In this
implementation the Kernel and Supervisor addresses are fixed
at system generation time since the core image contains
absolute addressing; future SASS implementations presumably
need not contain absolute addresses and the core image load
addresses could be determined dynamically at run time.
E. RUN TIME INITIALIZATION

With the current implementation no operating system data
structures are initialized within the bootstrap program.
Some of these functions are currently performed at the start
of execution of the SASS core image. For example, Strickler
[7] in his work assembled these functions into one
"bootstrap loader" module which basically initializes the
data structures used by the Inner Traffic Controller for
processor management. After this level of the operating
system is initialized, the next level, the Traffic
Controller for process management, is initialized, and so
forth in a layered manner. These initialization tasks are
based on the minimal configuration assumed by the base layer
of the operating system, and as such are more related to the
core image of the operating system than to the hardware
configuration.

Those initialization tasks that create and initialize
operating system databases used for the purposes of resource
management are considered as run time initialization. The
resources defined in this operating system include primary
and secondary storage, processors, processes, and memory
segments. Of these, the ones associated with the minimal
configuration are processors and memory. This appears to be
generally the case with most operating systems; therefore
the code for initializing the data structure containing
knowledge of the processors and local/global storage should,
as in this implementation, be included within the bootstrap
program. The databases for management of any other
resources, such as the Active Process Table or Virtual
Processor Table used in SASS, should be initialized during
run time initialization or during the layering of the
operating system. In the SASS example, this refers to
processes and memory segments.

Unique to SASS is a different resource that is directly
managed by process management. This resource is the Host
systems. These are known to the hardware architecture
through I/O ports and interfaces which are hardware devices.
These devices must be initialized before communication can
be established with the Host systems. The existence of these
ports cannot be as easily "discovered" as were the
processors and memory. Each must be initialized prior to any
interaction with the processors. Though the number of ports
or interfaces is not known at system generation time, the
manner in which they are addressed or communicated with is
known. A shotgun approach to device initialization can be
taken throughout this addressing range. A similar action is
performed in this implementation during hardware
initialization where the MCZ port is initialized by all
processors for the possibility that the MCZ system is
attached.
Once the lines of communication are open, the Host systems
can be communicated with. However, each Host system has an
access class assigned to its process abstraction. These must
be assigned prior to any communications with the Host
systems, and therefore must be associated with the ports. At
some point between the initialization of the ports and
communication with the Host systems, an association of these
logical attributes to the appropriate port must be made.
Since an upper bound on the number of ports is implicit in
the addressing method, a database containing an entry for
each possible Host system can be constructed at system
generation time within the core image. This table at system
generation time would statically show no existing Host
systems.
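
One possible shape for such a Host system database is
sketched below. It is purely illustrative, since the actual
implementation depends on SASS design issues not yet
addressed; the bound of eight ports, the integer access
class, and all names are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_HOST_PORTS = 8  # assumed upper bound implied by the port addressing


@dataclass
class HostEntry:
    present: bool = False               # statically, no Host system exists
    access_class: Optional[int] = None  # assigned before any communication


def empty_host_table(max_ports: int = MAX_HOST_PORTS) -> List[HostEntry]:
    """Table as built at system generation time: one empty entry per port."""
    return [HostEntry() for _ in range(max_ports)]


def add_host(table: List[HostEntry], port: int, access_class: int) -> None:
    """Record a Host system on a port together with its access class."""
    if not 0 <= port < len(table):
        raise ValueError("port outside the addressing range")
    table[port] = HostEntry(present=True, access_class=access_class)


def remove_host(table: List[HostEntry], port: int) -> None:
    """Return a port's entry to the empty state."""
    if not 0 <= port < len(table):
        raise ValueError("port outside the addressing range")
    table[port] = HostEntry()
```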

The addition or subtraction of Host systems becomes
effective during the initialization phase, primarily in the
bootstrap program, when the system resource knowledge is
consolidated. It is at this point that the knowledge of
memory resources and processor resources is passed to the
base layer of the operating system. Any knowledge of Host
system additions or deletions should also be passed at this
point. Since these changes require operator intervention, a
separate "cold-boot" bootstrap program having an interactive
code section at this point must be used to facilitate the
changes. The knowledge of these changes, or of no changes,
can then be used by the base layer of the operating system
to update its Host system database, which resides with the
operating system core image on secondary storage, during run
time initialization.


F. SUMMARY


A detailed description of the bootload and bootstrap
programs which effected initialization of the SASS was
presented in this chapter. The implementation design adhered
to the general initialization design presented in chapter
III, with any differences pointed out. Those sections of the
programs having general application were distinguished from
the system dependent sections, and the reasons for each were
explained.

In addition, a method for the treatment of Host systems as
resources was presented and a possible implementation
described. The actual implementation is dependent on SASS
design issues not yet addressed.
V. CONCLUSIONS


A. SUMMARY OF RESULTS

This thesis has presented a multi-processor initialization
design for dynamic determination of resources in an adaptive
manner. The design is general in nature and therefore
applicable to a wide range of hardware architectures and
operating systems. The three phases of initialization were
addressed and two program mediums were identified: the
bootload program which comprises the firmware, and the
bootstrap program contained on secondary storage. The
bootload phase was broken down into groups of tasks or
stages: (1) Independent Processor Stage, (2) Cooperating
Processor Stage, (3) Local Initialization Stage, (4) Global
Initialization Stage, and (5) Core Image Load Stage. The
independent and cooperating processor stages and the local
initialization stage make up the bootload program which
resides in ROM. The remaining stages comprise the bootstrap
program.

The independent processor stage dynamically determines the
existence of local and global memory and other processors,
while operating in a variable free environment (viz., no RAM
available). In the cooperating processors stage each
processor provides to the system, in a coordinated manner,
the knowledge of its own unique identification from firmware
and a complete local and global memory map. Each processor
completes its own initialization functions in the local
initialization stage. These functions include setting the
internal special purpose registers, establishing its own
local data structures, and completing the initialization of
its own hardware devices. In the global initialization stage
the resource knowledge provided by each processor is
consolidated into the system databases which establish the
minimal configuration for the operating system base layer.
The core image load stage effects the loading of the local
and global sections of the operating system and starts them
running.

The significant features of the design include the general
applicability of the design, the dynamic resource mapping
scheme, and the hardware synchronization mechanism. The
general nature of the design is based on independence of the
program modules, i.e. bootload, bootstrap, and operating
system base layer. No knowledge of one is assumed by the
other at system generation time; rather, the knowledge is
dynamically passed between units. Dynamic resource mapping
is performed on the processors and primary storage, to
support the minimal configuration. The hardware
synchronization method makes use of a global database known
as the configuration table to facilitate interprocessor
communication by a randomly selected controlling processor
(bootload CPU) that synchronizes processors and interfaces
secondary storage.

An implementation using this initialization design was
accomplished for a member (SASS) of a family of secure,
multiprogramming, multi-microcomputer operating systems. The
implementation was used to effect a running SASS
demonstration module in a single processor environment.


B. FOLLOW ON WORK

SASS initialization as provided by this implementation will
have to undergo several modifications before the final
version will exist. An effort was made to concentrate those
areas requiring future change into one program medium, the
bootstrap program. The firmware should require only minor
modifications, affecting only the secondary storage
interfacing primitives. The bootstrap program must be
modified as the evolution of the SASS progresses. Those
areas for modification are discussed in chapter IV. No
significant follow on work to this implementation is
required.

The SASS system provides possible areas in both the hardware
architecture and operating system that would be suitable for
immediate continued research. In the area of hardware, the
selection and interfacing of a suitable secondary storage
device to the SASS Developmental Architecture is of most
immediate concern. This modification would facilitate the
full hardware architecture realization and would thereby
permit further implementable work in the SASS in the area of
the Supervisor. Another hardware concern is the manner of
Host computer systems interfacing. The Am96/4116 MonoBoard
supports two serial and one parallel ports for this use. The
use of fiber optics to interconnect Host systems to these
ports may be of special interest from a security standpoint
as a deterrent to signal interception.

In the area of SASS, several areas are of immediate
interest. The completion of the run time initialization
stage as discussed in chapter IV is required before a
multiprocessor version of SASS can be attained. Further work
in the Kernel includes the actual implementation of the
memory manager process for resource management. The
implementation of the Supervisor has not been addressed to
date. Its areas of research include the implementation of
the File Manager and I/O processes, and the final design and
implementation of the SASS-Host protocols. Another
interesting area could be the use of the idle process to
perform some useful work. Some of the functions performed
during initialization could again be used as preventative
diagnosis by the idle process, to provide a measure of fault
tolerance. Other interesting areas include the
implementation of dynamic process creation, deletion and
loading, and the support of multilevel Host systems.